Xah Lee, 2008-06, 2010-03-11, 2010-09-01
This page shows some common examples of emacs lisp for batch text processing, the type of tasks one would typically do with unix shell tools or Perl. For example: find & replace on a list of given files or a dir, process (small) log files, compile a bunch of files, generate a report.
Open a file, process it, save, close it.
; open a file, process it, save, close it
(defun my-process-file (fpath)
  "process the file at fullpath FPATH …"
  (let (mybuffer)
    (setq mybuffer (find-file fpath))
    (goto-char (point-min)) ;; in case buffer already open
    ;; do something
    (save-buffer)
    (kill-buffer mybuffer)))
For processing hundreds of files, you don't need emacs to keep undo info or fontification. It is more efficient to insert file content into a temp buffer. Like this:
(defun my-process-file (fpath)
  "Process the file at path FPATH …"
  (let ()
    ;; create temp buffer without undo record.
    ;; first space in temp buff name is necessary
    (set-buffer (get-buffer-create " myTemp"))
    (insert-file-contents fpath nil nil nil t)
    ;; process it …
    ;; (goto-char 1) ; move to beginning of file's content
    ;; …
    ;; (write-file fpath) ;; write back to the file
    (kill-buffer " myTemp")))
To read a whole file into a list of lines, you can use this code:
(defun read-lines (fpath)
  "Return a list of lines of a file at FPATH."
  (with-temp-buffer
    (insert-file-contents fpath)
    (split-string (buffer-string) "\n" t)))
Once you have a list, you can use “mapcar” to process each element in the list. If you don't need the resulting list, use “mapc”.
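Here's a small sketch of the “mapcar” vs “mapc” difference. The names my-read-lines, my-line-lengths and my-print-lines are made up for this illustration (my-read-lines just mirrors the read-lines function above so the example stands alone). Note that the third argument “t” to split-string drops empty lines.

```elisp
;; Sketch only: all function names here are hypothetical.
(defun my-read-lines (fpath)
  "Return a list of the non-empty lines of the file at FPATH."
  (with-temp-buffer
    (insert-file-contents fpath)
    (split-string (buffer-string) "\n" t)))

(defun my-line-lengths (fpath)
  "Return a list holding the length of each non-empty line of the file at FPATH."
  ;; mapcar collects the results into a new list
  (mapcar 'length (my-read-lines fpath)))

(defun my-print-lines (fpath)
  "Print each non-empty line of the file at FPATH to standard output."
  ;; mapc is for side effects only, no result list is built
  (mapc (lambda (line) (princ (concat line "\n")))
        (my-read-lines fpath)))
```

Use the mapcar form when you want the transformed list back, and the mapc form when you only care about the side effect (printing, writing, etc.).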
Note: in elisp, it's more efficient to process text in a buffer than to do complicated string manipulation with the string data type. But in most situations, it's simpler to deal with a list of line strings. For an example of line-by-line processing in a buffer, see: Process a File line-by-line in Emacs Lisp.
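For comparison, here is a minimal sketch of working line by line inside a buffer, without building a list of strings. The function name my-count-matching-lines is an assumption for this example, not something from a library.

```elisp
;; Sketch only: `my-count-matching-lines' is a hypothetical name.
(defun my-count-matching-lines (fpath regex)
  "Return the number of lines in the file at FPATH that contain a match for REGEX."
  (let ((count 0))
    (with-temp-buffer
      (insert-file-contents fpath)
      (goto-char (point-min))
      (while (not (eobp))
        ;; search only within the current line
        (when (re-search-forward regex (line-end-position) t)
          (setq count (1+ count)))
        ;; move to the beginning of the next line
        (forward-line 1)))
    count))
```

For example, (my-count-matching-lines "access.log" "^ERROR") would count the lines that start with “ERROR”, where “access.log” is just a stand-in file name.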
Commonly used functions to manipulate file names.
(file-name-directory f)      ; get dir path
(file-name-nondirectory f)   ; get file name
(file-name-extension f)      ; get suffix
(file-name-sans-extension f) ; remove suffix
(file-relative-name f)       ; get relative path
(expand-file-name f)         ; get full path
default-directory            ; get the current dir (this is a variable)
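To make the above concrete, here is what these functions return for a sample path. The path “/home/joe/web/index.html” is made up, and the results shown assume a unix-like system (on Windows, expand-file-name would add a drive letter).

```elisp
;; Sketch only: the sample path is hypothetical.
(defvar my-sample-path "/home/joe/web/index.html")

(file-name-directory my-sample-path)      ; → "/home/joe/web/"
(file-name-nondirectory my-sample-path)   ; → "index.html"
(file-name-extension my-sample-path)      ; → "html"
(file-name-sans-extension my-sample-path) ; → "/home/joe/web/index"

;; path relative to a given dir (second arg defaults to default-directory)
(file-relative-name my-sample-path "/home/joe/") ; → "web/index.html"

;; full path of a name, resolved against a given dir
(expand-file-name "index.html" "/home/joe/web/") ; → "/home/joe/web/index.html"
```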
Commonly used functions to manipulate files and dirs.
(rename-file FILE NEWNAME &optional OK-IF-ALREADY-EXISTS)
(copy-file FILE NEWNAME &optional OK-IF-ALREADY-EXISTS KEEP-TIME PRESERVE-UID-GID)
(delete-file FILE)
(set-file-modes FILE MODE)
;; get list of file names
(directory-files DIR &optional FULL MATCH NOSORT)
;; create a dir. If PARENTS is true, non-existent parent dirs will be created
(make-directory DIR &optional PARENTS)
;; copy/delete whole dir
(delete-directory DIRECTORY &optional RECURSIVE) ; RECURSIVE option new in emacs 23.2
(copy-directory DIR NEWNAME &optional KEEP-TIME PARENTS) ; new in emacs 23.2
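As a rough sketch of how a few of these fit together, the following copies all files in one dir that match a regex into another dir, creating the target dir if needed. The function name my-copy-matching-files and its argument names are assumptions made for this example.

```elisp
;; Sketch only: `my-copy-matching-files' is a hypothetical name.
(defun my-copy-matching-files (from-dir to-dir regex)
  "Copy regular files in FROM-DIR whose names match REGEX into TO-DIR.
TO-DIR is created if it does not exist.  Return the list of new file paths."
  (let (copied)
    ;; second arg t: also create missing parent dirs, no error if TO-DIR exists
    (make-directory to-dir t)
    ;; second arg t: return full paths instead of bare file names
    (dolist (f (directory-files from-dir t regex))
      (when (file-regular-p f) ; skip subdirs that happen to match
        (let ((dest (expand-file-name (file-name-nondirectory f) to-dir)))
          (copy-file f dest t) ; t: overwrite if dest already exists
          (push dest copied))))
    (nreverse copied)))
```

A call might look like (my-copy-matching-files "~/web/" "~/web_backup/" "\\.html\\'"), where both dir names are placeholders.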
How to find the currently executing program's name?
(or load-file-name buffer-file-name)
The result is the full path of the file. If you want just its directory, call file-name-directory on the result.
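One thing to watch: load-file-name is only set while a file is being loaded, so it's safest to capture the value at load time. Here's a minimal sketch of that. The names my-current-script-path, my-script-dir and my-data-file, and the file “data.txt”, are made up; falling back to default-directory when neither variable is set is a choice of this sketch.

```elisp
;; Sketch only: all names here are hypothetical.
(defun my-current-script-path ()
  "Return the full path of the file being loaded, else the visited file, else nil."
  (or load-file-name buffer-file-name))

;; Evaluated once, when this file is loaded.
(defvar my-script-dir
  (let ((path (my-current-script-path)))
    (if path
        (file-name-directory path)
      default-directory)) ; assumption: fall back to the current dir
  "Directory of this script.")

;; e.g. find a data file that sits next to the script
(defvar my-data-file (expand-file-name "data.txt" my-script-dir))
```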
Example: make backup file.
(defun make-backup ()
  "Make a backup copy of current buffer's file.
Create a backup of current buffer's file.
The new file name is the old file name postfixed with “~”, in the same dir.
If such a file already exists, append more “~”.
If the current buffer is not associated with a file, it's an error."
  (interactive)
  (let (cfile bfilename)
    (setq cfile (buffer-file-name))
    (setq bfilename (concat cfile "~"))
    (while (file-exists-p bfilename)
      (setq bfilename (concat bfilename "~")))
    (copy-file cfile bfilename t)
    ;; use a format string, so a “%” in the file name is not misread
    (message "Backup saved as: %s" (file-name-nondirectory bfilename))))
; idiom for calling a shell command
(shell-command "cp /somepath/myfile.txt /somepath")
; idiom for calling a shell command and get its output
(shell-command-to-string "ls")
Both shell-command and shell-command-to-string will wait for the shell process to finish before continuing. To not wait, use start-process or start-process-shell-command.
(info "(elisp) Asynchronous Processes")
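Here's a minimal sketch of the non-waiting form using start-process with a sentinel (a function emacs calls when the process changes state). The function name my-run-async, the process name “my-async” and the buffer name “*my-async-output*” are all assumptions for this example.

```elisp
;; Sketch only: `my-run-async' and the buffer/process names are hypothetical.
(defun my-run-async (program &rest args)
  "Start PROGRAM with ARGS without waiting for it to finish.
Output is collected in the buffer “*my-async-output*”.
Return the process object."
  (let ((proc (apply 'start-process
                     "my-async"          ; process name
                     "*my-async-output*" ; buffer that receives output
                     program args)))
    ;; the sentinel is called when the process exits or is killed
    (set-process-sentinel
     proc
     (lambda (process event)
       (message "%s: %s" (process-name process) event)))
    proc))
```

In an interactive session, (my-run-async "ls" "-l") returns right away. In a batch script, emacs exits when the script ends, so if you need the output you'd still have to wait near the end, for example with a loop like (while (process-live-p proc) (accept-process-output proc 0.1)).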
In the following, my-process-file is a function that takes a file's full path as input. find-lisp-find-files generates a list of full paths, using a regex on the file name. “mapc” applies the function to each element of the list.
; idiom for traversing a directory
(require 'find-lisp)
(mapc 'my-process-file (find-lisp-find-files "~/web/emacs/" "\\.html$"))
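As an alternative sketch: emacs 25.1 and later also have a built-in directory-files-recursively, which needs no “require”. This is a swapped-in technique, not the find-lisp idiom above. The wrapper name my-walk-files is made up for this example.

```elisp
;; Sketch only: `my-walk-files' is a hypothetical name.
;; Needs emacs 25.1 or later for `directory-files-recursively'.
(defun my-walk-files (dir regex func)
  "Call FUNC on the full path of each file under DIR whose name matches REGEX.
Subdirectories are searched too."
  (mapc func (directory-files-recursively dir regex)))
```

Usage would then be something like (my-walk-files "~/web/emacs/" "\\.html\\'" 'my-process-file).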
You can run an elisp program in the Operating System's command line interface (shell), using the “--script” option. For example:
emacs --script process_log.el
Emacs has a few other options and variations to control how you run an elisp script. Here's a table of the main options:
| full option name | meaning |
|---|---|
| --no-site-file | Do not load the site wide “site-start.el”. |
| --no-init-file | Do not load your init files 〔~/.emacs〕 or “default.el”. |
| --batch | Run emacs in batch mode, use it together with “--load” to specify a lisp file. This implies “--no-init-file” but not “--no-site-file”. |
| --load="‹elisp file path›" | Execute the elisp file at “‹elisp file path›”. |
| --script ‹file path› | Run emacs like “--batch” with “--load” set to “‹file path›”. |
| --user=‹user name› | Load user ‹user name›'s emacs init file (the “.emacs”). |
The “site-start.el” is an init file for site wide running of emacs. It pretty much means an init file for all users of this emacs installation. It may be added by a sys admin, or it may be part of a particular emacs distribution (e.g. Carbon Emacs, Aquamacs Emacs, ErgoEmacs …). You can usually find this file in the directory where emacs is installed, if it exists.
When you write an elisp script to run in batch, make sure your elisp file is self-contained, just like you would with a Perl or Python script: it doesn't call functions in your emacs init file, it loads all libraries it needs (using “require” or “load”), and it has the necessary load path set in the script (e.g. “(add-to-list 'load-path ‹lib path›)”).
If you've done a clean job in your elisp script, then, all you need to use is “emacs --script ‹elisp file path›”.
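For illustration, here's a rough sketch of what such a self-contained script could look like. It prints the line count of each file given on the command line. The function names my-count-lines-in-file and my-main, and the script name in the comments, are made up. The variable command-line-args-left is a standard emacs variable holding the arguments not yet processed.

```elisp
;; Sketch only: the function names and the script name are hypothetical.
;; Intended use:  emacs --script count_lines.el file1 file2

(defun my-count-lines-in-file (fpath)
  "Return the number of lines in the file at FPATH."
  (with-temp-buffer
    (insert-file-contents fpath)
    (count-lines (point-min) (point-max))))

(defun my-main (args)
  "Print “‹file›: ‹line count›” for each file path in the list ARGS."
  (dolist (f args)
    ;; in batch mode, princ writes to standard output
    (princ (format "%s: %d\n" f (my-count-lines-in-file f)))))

;; At the very end of the script file you would call:
;;   (my-main command-line-args-left)
;;   (setq command-line-args-left nil) ; so emacs doesn't try to open the args as files
```

Keeping the work in a function that takes the argument list, rather than reading command-line-args-left all over the place, also makes the script easy to test from inside emacs.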
If your elisp program requires functions that you've defined in your emacs init file (the “.emacs”), then you should explicitly load it in your script with “(load ‹emacs init file path›)”, or you can add the option to load it, like this: “--user=xah”. (It's best to actually pull the functions you need out of the init file and into the script.)
If you are on a Mac with Carbon Emacs or Aquamacs, call it from the command line like this:
/Applications/Emacs.app/Contents/MacOS/Emacs --script process_log.el
2010-06-04 Thanks to Rubén Berenguel for a correction.
For some practical examples of batch style text processing, see: