Regular expressions are built into Python via the re module. A very simple example:
In [7]:
import re
match = re.search(r'\d+', r'abc123def')  # note the "r" (raw string) prefix
print(match.span())  # what do the numbers represent?
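As a hint for the question in the comment, here is a minimal sketch of how the span relates to the searched text. The variable names (text, start, end) are chosen only for illustration, and print is called with a single parenthesized argument so the lines behave the same way in Python 2 and Python 3.

```python
import re

text = 'abc123def'
match = re.search(r'\d+', text)

# span() returns the (start, end) indices of the match; end is exclusive,
# so slicing the text with them recovers the matched substring.
start, end = match.span()
print(match.span())     # (3, 6)
print(text[start:end])  # 123
print(match.group())    # 123, the matched text itself

# The r prefix creates a raw string: backslashes are kept literally,
# so r'\n' is two characters while '\n' is a single newline.
raw_length = len(r'\n')
plain_length = len('\n')
print(raw_length)       # 2
print(plain_length)     # 1
```

Raw strings matter most for patterns such as r'\d+', where the backslash should reach the re module unchanged.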
| name | description |
|---|---|
| `\d` | any digit, i.e., `[0-9]` |
| `\D` | any non-digit, i.e., `[^0-9]` |
| `\s` | any whitespace, i.e., `[ \t\n\r\f\v]` |
| `\S` | any non-whitespace, i.e., `[^ \t\n\r\f\v]` |
| `\w` | alphanumeric or underscore, i.e., `[a-zA-Z0-9_]` |
| `\W` | non-alphanumeric, i.e., `[^a-zA-Z0-9_]` |
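The shorthand classes can be tried out with re.findall and re.split. The sketch below uses a made-up sample string, and it assumes plain ASCII input, for which the bracketed equivalents in the table hold exactly.

```python
import re

# Hypothetical sample text containing digits, an underscore, and a tab.
sample = 'Room 42, floor_3\tEast wing'

digits = re.findall(r'\d+', sample)    # runs of digits
words = re.findall(r'\w+', sample)     # runs of letters, digits, or underscore
pieces = re.split(r'\s+', sample)      # split on runs of whitespace
non_word = re.findall(r'\W', sample)   # single characters that are not \w

print(digits)    # ['42', '3']
print(words)     # ['Room', '42', 'floor_3', 'East', 'wing']
print(pieces)    # ['Room', '42,', 'floor_3', 'East', 'wing']
print(non_word)  # [' ', ',', ' ', '\t', ' ']
```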
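| name | description |
|---|---|
| `.` | any character except `\n` |
| `^` | match at the beginning of the string; as the first character inside `[]`, complement the class |
| `$` | match at the end of the string |
| `*` | match 0 or more times |
| `+` | match 1 or more times |
| `?` | match 0 or 1 times |
| `\` | escape character |
| `\|` | "or" |
| `[]` | defines a character class, e.g., `[a-z]` |
| `{}` | repetition qualifier, e.g., `ab{2,3}` matches `a` followed by 2 or 3 `b`s |
| `()` | for groups |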
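A few of these metacharacters in action. This is only a sketch with invented test strings; the variable names are illustrative and not part of the course material.

```python
import re

# | tries the left alternative, then the right one.
alternation = re.findall(r'cat|dog', 'cat, dog, cow')

# {2,3} repeats the preceding item (here just b) two or three times.
repeats = re.findall(r'ab{2,3}', 'ab abb abbb abbbb')

# ? makes the preceding item optional.
optional = re.findall(r'colou?r', 'color colour')

# . matches any character except a newline.
dot = re.findall(r'c.t', 'cat cot c\nt')

# () captures the parts of the match for later retrieval.
grouped = re.search(r'(\w+)@(\w+)', 'mail bob@example.com now')

print(alternation)       # ['cat', 'dog']
print(repeats)           # ['abb', 'abbb', 'abbb']
print(optional)          # ['color', 'colour']
print(dot)               # ['cat', 'cot']
print(grouped.groups())  # ('bob', 'example')
```

Note that 'abbbb' contributes only 'abbb': the qualifier stops after three b's and the leftover b does not start a new match.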
In [8]:
pattern = r'ca*t'
print(re.match(pattern, r'ct').span())     # zero a's
print(re.match(pattern, r'cat').span())    # one a
print(re.match(pattern, r'caaat').span())  # three a's
print(re.match(pattern, r'go cats!'))      # None: re.match only matches at the start
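The last call prints None because re.match only tries the pattern at the very beginning of the string. As a small sketch of the difference, re.search scans forward until it finds a match anywhere in the string:

```python
import re

pattern = r'ca*t'
text = 'go cats!'

at_start = re.match(pattern, text)   # anchored at index 0, so no match here
anywhere = re.search(pattern, text)  # scans the string for the first match

print(at_start)          # None
print(anywhere.span())   # (3, 6)
print(anywhere.group())  # cat
```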
In [9]:
print(re.match(r'ca*[\w ]+t', r'catenkerous cat!').span())
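This highlights the fact that quantifiers such as * and + are greedy. In other words, they grab as large a match as possible, so the match runs through to the final t rather than stopping at the first one.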
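Greedy matching can be turned off by appending ? to a quantifier, which makes it match as little as possible. The sketch below uses a simplified pattern (c followed by the character class), chosen for illustration rather than taken from the cell above, so that the two behaviors are easy to tell apart.

```python
import re

text = 'catenkerous cat!'

# Greedy: [\w ]+ consumes as much as it can, then backs off to the last t.
greedy = re.match(r'c[\w ]+t', text)

# Non-greedy: [\w ]+? consumes as little as it can, stopping at the first t.
lazy = re.match(r'c[\w ]+?t', text)

print(greedy.group())  # catenkerous cat
print(greedy.span())   # (0, 15)
print(lazy.group())    # cat
print(lazy.span())     # (0, 3)
```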
Now, for the fun stuff. Run:
cd /path/to/ME701_examples
git pull
You should now have a new folder re with some fun, real-world data to munge!