
Text Pattern Matching in Emacs (emacs regex tutorial)


Xah Lee, 2007-08, 2009-08-20, 2011-01-13

Emacs's regex syntax is not based on Perl's or Python's, but it is very similar. In emacs regex, the parenthesis characters ( and ) are literal. If you want to capture a pattern, you need to escape the parens like this: \(myPattern\).
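As a rough illustration of the literal-versus-capturing paren rule, here is a minimal sketch using “string-match”. The function names my-has-literal-parens and my-capture-b are made up for this example; the backslashes are doubled because the regex is written inside a lisp string.

```elisp
;; Hypothetical helper: unescaped parens match literal "(" and ")".
(defun my-has-literal-parens (text)
  "Return the index where the literal text \"(b)\" starts in TEXT, or nil."
  (string-match "(b)" text))

;; Hypothetical helper: escaped parens form a capture group.
(defun my-capture-b (text)
  "Return the letter b captured from \"abc\" in TEXT, or nil."
  (when (string-match "a\\(b\\)c" text)
    (match-string 1 text)))

(my-has-literal-parens "a(b)c")  ; returns 1
(my-has-literal-parens "abc")    ; returns nil
(my-capture-b "abc")             ; returns "b"
```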

Common Patterns

Here are some common patterns:

Pattern             Matches
.                   any single character
\.                  one period
[0-9]+              sequence of digits
[A-Za-z]+           sequence of letters
[-A-Za-z0-9]+       sequence of letters, digits, hyphens
[_A-Za-z0-9]+       sequence of letters, digits, underscores
[-_A-Za-z0-9]+      sequence of letters, digits, hyphens, underscores
[[:blank:]]+        sequence of tabs and spaces
[[:upper:]]+        sequence of cap letters
[[:lower:]]+        sequence of lowercase letters
"\([^"]+?\)"        capture text between double quotes (non-greedy)
“\([^”]+?\)”        capture text between curly double quotes (non-greedy; unicode char)
(\([^)]+?\))        capture text between parentheses (non-greedy)
+                   means match previous pattern 1 or more times
*                   means match previous pattern 0 or more times
?                   means match previous pattern 0 or 1 time
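To see a couple of the patterns listed above at work in lisp code, here is a small sketch. The function names my-first-number and my-first-quoted are invented for this example, and the patterns are written with the extra backslashes that a lisp string requires.

```elisp
;; Hypothetical helper using the [0-9]+ pattern.
(defun my-first-number (text)
  "Return the first sequence of digits in TEXT as a string, or nil."
  (when (string-match "[0-9]+" text)
    (match-string 0 text)))

;; Hypothetical helper using the double-quote capture pattern.
;; Inside a lisp string, each \" is a literal double quote character.
(defun my-first-quoted (text)
  "Return the text between the first pair of double quotes in TEXT, or nil."
  (when (string-match "\"\\([^\"]+?\\)\"" text)
    (match-string 1 text)))

(my-first-number "abc 123 def")             ; returns "123"
(my-first-quoted "say \"hello\" and \"bye\"")  ; returns "hello"
```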

Differences from Perl's Regex

If you are familiar with Perl's regex, here are some practical major differences.

Test Your Regex

Emacs has an interactive regex mode. It shows matches as you type. To go into the mode, call “regexp-builder”.

Alternatively, you can call “query-replace-regexp” to test your pattern.

Test Regex for Elisp

To test a regex in your elisp code, you can open an empty file and place the regex function at the top and the text you want to match below it, like this:

(search-forward-regexp "yourRegex")

whatever text here

Then put your cursor to the right of the closing parenthesis and call “eval-last-sexp”. If your regex matches, the cursor moves to the end of the matched text. If you get a lisp error saying the search failed, then your regex didn't match. If you get a lisp syntax error, then you probably got the backslashes wrong.
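If you would rather check a regex without moving the cursor around by hand, the same test can be sketched in a temporary buffer. The function name my-regex-match-end is made up for this example. Passing t as the third argument makes the search return nil on failure instead of signaling a lisp error.

```elisp
;; Hypothetical helper: run the search in a scratch buffer.
(defun my-regex-match-end (regex text)
  "Return the buffer position just after the first match of REGEX in TEXT, or nil."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; nil = no search bound; t = return nil on failure instead of an error.
    (when (search-forward-regexp regex nil t)
      (point))))

(my-regex-match-end "[0-9]+" "abc 123 def")  ; returns 8
(my-regex-match-end "[0-9]+" "abc def")      ; returns nil
```

Buffer positions start at 1, so 8 is the position right after the “3”.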

Double Backslash in Lisp Code

In a lisp function that takes a regex string (e.g. “search-forward-regexp”), you need to double every backslash. This is because in an elisp string a backslash must itself be prefixed with a backslash. The string is interpreted first, and the result is then passed to emacs's regex engine.

For example, suppose you have this text:

Sin[x] + Sin[y]

and you need to capture the bracketed argument, [x] or [y]. You can use:

(search-backward-regexp "\\(\\[[a-z]\\]\\)")

The regex engine really just got:

\(\[[a-z]\]\)
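One way to convince yourself of this is to look at the string after lisp has read it. The sketch below uses a made-up variable my-bracket-regex and a made-up function my-last-bracketed. The string is typed with 19 characters, but lisp stores only 13, which is what the regex engine sees.

```elisp
;; Hypothetical variable holding the regex from the example above.
(defvar my-bracket-regex "\\(\\[[a-z]\\]\\)"
  "Regex matching one lowercase letter in square brackets, as a capture group.")

;; Hypothetical helper: search backward from the end of TEXT.
(defun my-last-bracketed (text)
  "Return the last bracketed lowercase letter in TEXT, brackets included, or nil."
  (with-temp-buffer
    (insert text)
    ;; After insert, point is at the end of the buffer, so search backward.
    (when (search-backward-regexp my-bracket-regex nil t)
      (match-string 1))))

(length my-bracket-regex)               ; returns 13
(my-last-bracketed "Sin[x] + Sin[y]")   ; returns "[y]"
```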