Match groups in Python
Is there a way in Python to access match groups without explicitly creating a match object (or another way to beautify the example below)? Here is an example to clarify my motivation for the question. The following Perl code
    if ($statement =~ /I love (\w+)/) {
        print "He loves $1\n";
    }
    elsif ($statement =~ /Ich liebe (\w+)/) {
        print "Er liebt $1\n";
    }
    elsif ($statement =~ /Je t\'aime (\w+)/) {
        print "Il aime $1\n";
    }

translates into Python as

    m = re.search("I love (\w+)", statement)
    if m:
        print "He loves",m.group(1)
    else:
        m = re.search("Ich liebe (\w+)", statement)
        if m:
            print "Er liebt",m.group(1)
        else:
            m = re.search("Je t'aime (\w+)", statement)
            if m:
                print "Il aime",m.group(1)
Caveat: Python's re.match() matches only at the beginning of the target string. Thus re.match("I love (\w+)", "Oh! How I love thee") would NOT match. Either use re.search() or explicitly prefix the regex with an appropriate wildcard pattern, as in re.match(".* I love (\w+)", ...).
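To make the caveat concrete, here is a minimal sketch comparing the three calls on the sentence from the comment. The variable names are only illustrative.

```python
import re

text = "Oh! How I love thee"

# re.match only tries the pattern at position 0, so this returns None.
anchored = re.match(r"I love (\w+)", text)

# re.search scans the string and returns the first match it finds.
scanned = re.search(r"I love (\w+)", text)

# A leading wildcard lets re.match reach the phrase as well.
prefixed = re.match(r".*I love (\w+)", text)

print(anchored)           # None
print(scanned.group(1))   # thee
print(prefixed.group(1))  # thee
```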
@S.Lott: oops, you are right. I didn't see it, though I searched before posting; nevertheless, there are valuable new answers here.
5 Answers
You could create a little class that returns the boolean result of calling match, and retains the matched groups for subsequent retrieval:
    import re

    class REMatcher(object):
        def __init__(self, matchstring):
            self.matchstring = matchstring

        def match(self,regexp):
            self.rematch = re.match(regexp, self.matchstring)
            return bool(self.rematch)

        def group(self,i):
            return self.rematch.group(i)


    for statement in ("I love Mary",
                      "Ich liebe Margot",
                      "Je t'aime Marie",
                      "Te amo Maria"):

        m = REMatcher(statement)

        if m.match(r"I love (\w+)"):
            print "He loves",m.group(1)

        elif m.match(r"Ich liebe (\w+)"):
            print "Er liebt",m.group(1)

        elif m.match(r"Je t'aime (\w+)"):
            print "Il aime",m.group(1)

        else:
            print ". "
Update: with print as a function in Python 3 and assignment expressions in Python 3.8, there is no need for a REMatcher class now:
    import re

    for statement in ("I love Mary",
                      "Ich liebe Margot",
                      "Je t'aime Marie",
                      "Te amo Maria"):

        if m := re.match(r"I love (\w+)", statement):
            print("He loves", m.group(1))

        elif m := re.match(r"Ich liebe (\w+)", statement):
            print("Er liebt", m.group(1))

        elif m := re.match(r"Je t'aime (\w+)", statement):
            print("Il aime", m.group(1))

        else:
            print()
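When the branches differ only in the pattern and the output prefix, a table-driven loop is another way to avoid the if/elif chain. This is a sketch of a different technique from the answer above, not a replacement for it; `RULES` and `describe` are hypothetical names introduced for illustration.

```python
import re

# Hypothetical table pairing each compiled pattern with its output prefix.
RULES = [
    (re.compile(r"I love (\w+)"), "He loves"),
    (re.compile(r"Ich liebe (\w+)"), "Er liebt"),
    (re.compile(r"Je t'aime (\w+)"), "Il aime"),
]

def describe(statement):
    """Return the output line for the first rule that matches, or None."""
    for pattern, prefix in RULES:
        m = pattern.match(statement)
        if m:
            return f"{prefix} {m.group(1)}"
    return None

for statement in ("I love Mary", "Ich liebe Margot", "Je t'aime Marie", "Te amo Maria"):
    print(describe(statement))
```

This keeps the patterns in one place, at the cost of assuming every branch does the same kind of work with the captured group.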
Check if match group exists without ‘try’?
Using Python 3.8 and regular expressions, is there a way to see if match group 2 exists without having to catch an exception? The 'is not None' check doesn't get evaluated, and I get a 'no such group' error before that. The function is supposed to pick apart email From addresses that look like '"Real Name" <foo.bar@example.com>'. If there is a real name, I would like to get that; if not, I would do some further checking.
    import re

    def cleanmystuff(stuff):
        # removes tabs, spaces, newlines, quotes, returns something
        regex = r"(.*\S).*<(.*)>"
        stuff = stuff.replace('"', ' ')
        stuff = stuff.replace(',', ' ')
        a_list = stuff.split()
        allthestuff = " ".join(a_list)
        matches = re.match(regex, allthestuff)
        if matches.group(2) is not None:
            return matches.group(2)
        # code may go on
3 Answers
The answer to this was useful to me in the context of named groups where I don’t even know which regexp (from a list of regular expressions) was executed. All I know is the name of the group. I’ll try to demonstrate a couple of solutions below:
    >>> match = re.search(r'(foo)?(bar)?', 'foo')
    >>> print(match and match.groups()[1])
    None
    >>> match = re.search(r'(foo)?(bar)?', 'foobar')
    >>> print(match and match.groups()[1])
    bar
    >>> match = re.search(r'(?P<foo>foo)?(?P<bar>bar)?', 'foo')
    >>> print(match and match.groupdict().get('bar'))
    None
    >>> match = re.search(r'(?P<foo>foo)?(?P<bar>bar)?', 'foobar')
    >>> print(match and match.groupdict().get('bar'))
    bar
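The "list of regular expressions" case mentioned above might look like the following sketch. `PATTERNS` and `find_named` are hypothetical names, and the patterns themselves are made up for illustration; the point is only that `groupdict().get()` returns None for a name the current pattern does not define, rather than raising.

```python
import re

# Hypothetical patterns; only the second one defines a group named "name".
PATTERNS = [
    re.compile(r"id=(?P<id>\d+)"),
    re.compile(r"user=(?P<name>\w+)"),
]

def find_named(text, group_name):
    """Return the first value captured under group_name, or None."""
    for pattern in PATTERNS:
        m = pattern.search(text)
        # groupdict() holds only the names this pattern defines, so .get()
        # yields None for an unknown name instead of raising IndexError.
        if m and m.groupdict().get(group_name) is not None:
            return m.groupdict()[group_name]
    return None

print(find_named("user=alice id=7", "name"))  # alice
print(find_named("id=7", "name"))             # None
```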
    import re

    def cleanmystuff(stuff):
        regex = r"(.*\S).*<(.*)>"
        stuff = stuff.replace('"', ' ')
        stuff = stuff.replace(',', ' ')
        a_list = stuff.split()
        allthestuff = " ".join(a_list)
        matches = re.match(regex, allthestuff)
        # re.match returns None when nothing matches, so test that first
        if matches and matches.group(2) is not None:
            return matches.group(2)
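It may help to separate three cases: no match at all (the result is None), a group that the pattern defines but that did not participate (`group(n)` returns None), and a group number the pattern does not define (`group(n)` raises IndexError, "no such group"). The sketch below assumes a From header of the form `"Real Name" <address>` with the quoted name optional; `FROM_RE`, `split_from`, and `safe_group` are hypothetical names, and the pattern is a simplification, not a full address parser.

```python
import re

# Assumed shape: an optional quoted real name, then the address in <...>.
FROM_RE = re.compile(r'(?:"([^"]*)"\s*)?<([^>]+)>')

def split_from(header):
    """Return (real_name, address); either part is None when absent."""
    m = FROM_RE.search(header)
    if m is None:
        return None, None
    # Group 1 is optional, so it is None when no quoted name was present.
    return m.group(1), m.group(2)

def safe_group(m, n):
    """Return group n, or None if there is no match or no such group."""
    # Pattern.groups is the number of capturing groups in the pattern.
    if m is None or n > m.re.groups:
        return None
    return m.group(n)

print(split_from('"Real Name" <foo.bar@example.com>'))
print(split_from('<foo.bar@example.com>'))
print(split_from('no address here'))
```

Checking `m.re.groups` up front avoids the try/except, though in most code the number of groups is fixed by the pattern and only the None checks are needed.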