Tag Archives: Python Regular Expressions

REGULAR EXPRESSIONS IN PYTHON

Regular Expressions in Python

A regular expression is a distinctive sequence of characters that helps the user to match or find other strings or sets of strings, using a specialized syntax held in a pattern. Regular expressions are widely used in UNIX world. The module re offers full support for Perl-like regular expressions in Python.

In this article we will be covering important functions, which would be used to handle regular expressions.

IN PYTHON A REGULAR EXPRESSION SEARCH IS TYPICALLY WRITTEN AS

match = re.search(pat, str)

The re.search() method takes a regular expression pattern and a string and searches for that pattern within the string. If the search is successful, search() returns a match object or None. Therefore, the search is usually followed by an if-statement to test if the search is succeeded or not.

str = ‘an example word:cat!!’
match = re.search(r’word:\w\w\w’, str)
# If-statement after search() tests if it succeeded
if match:
print ‘found’, match.group() ## ‘found word:cat’
else:
print ‘did not find’

The code match = re.search(pat, str) stores the search result in a variable named “match”. Then the if-statement tests the match — if true the search succeeded and match.group() is the matching text (e.g. ‘word:cat’). Otherwise if the match is false, then the search will not be succeeded, and there is no matching text.

The ‘r’ at the start of the pattern string designates a python “raw” string which passes through backslashes without change which is very convenient for regular expressions.

if-statement tests the match — if true the search succeeded and match.group() is the matching text (e.g. ‘word:cat’). If the match is false, then the search will not be succeeded, and there is no matching text.

The ‘r’ at the start of the pattern string designates a python “raw” string which passes through backslashes without change which is very convenient for regular expressions.

THE MATCH FUNCTION

The function attempts to match re pattern to string with optional flags.

Here is the syntax for this function:
re.match(pattern, string, flags=0)

The re.match function returns a match object on success. The usegroup(num) or groups() function is used to match objects to get matched expression.

The match function

#!/usr/bin/python
import re
line = “Cats are smarter than dogs”
matchObj = re.match( r'(.*) are (.*?) .*’, line, re.M|re.I)
if matchObj:
print “matchObj.group() : “, matchObj. group ()
print “matchObj.group(1) : “, matchObj. group (1)
print “matchObj.group(2) : “, matchObj. group (2)
else:
print “No match!!”

When the above code is executed, it gives the following result:

matchObj.group() : Cats are smarter than dogs
matchObj.group(1) : Cats
matchObj.group(2) : smarter