Named Groups of Regular Expressions

Named groups make regular expressions, and the code that uses them, more readable.


In [8]:
m.group(2)


Out[8]:
'Mackenzie'

In [16]:
m.group('first_name')


Out[16]:
'Mackenzie'

The Zen of Python, by Tim Peters

Readability counts.


In [1]:
import re

Regular expressions can be used to test whether a string matches a pattern.
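
As a minimal sketch of that yes/no use: `re.fullmatch` returns a match object when the whole string fits the pattern and `None` otherwise. The pattern, the helper name `is_letters_only`, and the sample strings are made up for illustration and are not part of the original notebook.

```python
import re

# Hypothetical pattern: one or more ASCII letters and nothing else.
letters_only = re.compile(r'[A-Za-z]+')

def is_letters_only(text):
    """Return True if text consists solely of ASCII letters."""
    # fullmatch returns a match object on success and None on failure.
    return letters_only.fullmatch(text) is not None

print(is_letters_only('Mackenzie'))    # True
print(is_letters_only('Mackenzie42'))  # False
```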

Regular expressions can also be used for parsing. The substrings of interest are called groups. The traditional way of referring to a group is by its index number. Python also lets you refer to a group by name.

Using names gives both the regular expression and the references to its match groups more meaning. They make Python code more readable.


In [2]:
foo_pattern = re.compile('''
    ^
    ([A-Za-z]+)
    ,[ ]
    ([A-Za-z]+)
    $
''', re.VERBOSE)

In [3]:
s = 'James, Mackenzie'

In [4]:
m = re.match(foo_pattern, s)
m


Out[4]:
<_sre.SRE_Match object; span=(0, 16), match='James, Mackenzie'>

In [5]:
m.groups()


Out[5]:
('James', 'Mackenzie')

In [6]:
m.group(0)


Out[6]:
'James, Mackenzie'

In [7]:
m.group(1)


Out[7]:
'James'

In [8]:
m.group(2)


Out[8]:
'Mackenzie'

In [9]:
foo_pattern = re.compile('''
    ^
    (?P<last_name>[A-Za-z]+)
    ,[ ]
    (?P<first_name>[A-Za-z]+)
    $
''', re.VERBOSE)

In [10]:
m = re.match(foo_pattern, s)
m


Out[10]:
<_sre.SRE_Match object; span=(0, 16), match='James, Mackenzie'>

In [11]:
m.groups()


Out[11]:
('James', 'Mackenzie')

In [12]:
m.group(0)


Out[12]:
'James, Mackenzie'

In [13]:
m.group(1)


Out[13]:
'James'

In [14]:
m.group(2)


Out[14]:
'Mackenzie'

In [15]:
m.group('last_name')


Out[15]:
'James'

In [16]:
m.group('first_name')


Out[16]:
'Mackenzie'
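
Naming groups also makes it possible to collect every named group at once: a match object's `groupdict()` method returns them as a dictionary keyed by group name. The sketch below rebuilds the named pattern from above under the hypothetical name `name_pattern` so that it runs on its own.

```python
import re

# Same pattern as the named-group version above; the variable name is illustrative.
name_pattern = re.compile(r'''
    ^
    (?P<last_name>[A-Za-z]+)
    ,[ ]
    (?P<first_name>[A-Za-z]+)
    $
''', re.VERBOSE)

m = name_pattern.match('James, Mackenzie')

# groupdict() maps each group name to the substring it captured.
parts = m.groupdict()
print(parts)

# The names can then drive formatting directly, with no index bookkeeping.
print('{first_name} {last_name}'.format(**parts))  # Mackenzie James
```

Because the lookup is by name, adding or reordering groups in the pattern later would not silently change which substring each reference returns.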

Questions

  1. Was Python the first to support named groups in regular expressions?

  2. Catherine asked why there is a capital P in the named group syntax.

Eric found the article Named regular expression group “(?P<group_name>regexp)”: what does “P” stand for?, which addresses both questions.

  1. Yes, Python was first.
  2. The 'P' seems to stand for Python, but we do not really know.