Text Editor's Cursor Movement Behavior (emacs, vi, Notepad++)

By Xah Lee.

This article discusses some differences in cursor movement behavior among editors. That is, when you press Ctrl+← or Ctrl+→ on a line of programming language code with many different sequences of symbols, where exactly does the cursor stop?

Always End at Beginning of Word?

Type the following in your favorite text editor.

something in the water does not compute

Now, you can try the word movement in different editors.

I tested this in Notepad, Notepad++, vim, emacs, and Mac's TextEdit.

In Notepad, Notepad++, and vim, the cursor always ends at the beginning of each word.

In emacs, TextEdit, and Xcode, the cursor ends at the beginning of the word if you are moving left, but at the end of the word if you are moving right.

That's the first major difference.
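The two models can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not code from any of these editors; the function names are made up here, and a "word" is assumed to be a run of letters and digits.

```python
LINE = "something in the water does not compute"

def forward_to_word_start(text, pos):
    """Move right and land on the beginning of the next word
    (the behavior described for Notepad, Notepad++ and vim)."""
    n = len(text)
    while pos < n and text[pos].isalnum():      # leave the current word
        pos += 1
    while pos < n and not text[pos].isalnum():  # skip the gap after it
        pos += 1
    return pos

def forward_to_word_end(text, pos):
    """Move right and land on the end of the next word
    (the behavior described for emacs, TextEdit and Xcode)."""
    n = len(text)
    while pos < n and not text[pos].isalnum():  # skip the gap first
        pos += 1
    while pos < n and text[pos].isalnum():      # then run through the word
        pos += 1
    return pos

def stops(move, text):
    """Collect every cursor position reached by repeating `move` from column 0."""
    pos, out = 0, []
    while pos < len(text):
        pos = move(text, pos)
        out.append(pos)
    return out

print(stops(forward_to_word_start, LINE))  # [10, 13, 17, 23, 28, 32, 39]
print(stops(forward_to_word_end, LINE))    # [9, 12, 16, 22, 27, 31, 39]
```

The first list holds the columns where words begin (plus the end of line), the second the columns just past each word. The second model never stops at a word beginning when moving right, and the first never stops at a word end.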

Does Movement Depend on the Language Mode?

Now, try this line:

something !! in @@ the ## water $$ does %% not ^^ compute

Now, the behavior of vim and Notepad++ is identical. It is simple, and the same as before: they put the cursor at the beginning of each string sequence, no matter what the characters are. Notepad is similar, except that it will also stop in the middle of %%.

Emacs and TextEdit behave similarly to each other. Emacs (in text-mode) skips the symbol clusters !!, @@, ##, ^^ entirely, while stopping at the boundaries of $$ and %%. TextEdit stops in the middle of $$ and ^^, but skips the other symbol clusters entirely.

I don't know about other editors, but i understand the behavior of emacs well. Emacs has a syntax table concept. Each and every character is classified into one of “whitespace”, “word”, “symbol”, “punctuation”, and others. When you press Ctrl+←, emacs calls backward-word, which simply moves until it reaches a char that's not in the “word” group.

Each major mode usually has its own syntax table. So, depending on which mode you are in, emacs will either skip a sequence of identical chars entirely, or stop at its boundary.
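Here is a rough Python model of that idea. It is a sketch of the concept only, not Emacs's actual implementation or its real tables: the class names, make_table, and the choice of which characters count as “word” are all assumptions made up for illustration.

```python
WORD, SPACE, PUNCT = "word", "whitespace", "punctuation"

def make_table(extra_word_chars=""):
    """Return a toy 'syntax table': a function mapping a char to its class.
    `extra_word_chars` plays the role of a mode declaring more word chars."""
    def classify(ch):
        if ch.isalnum() or ch in extra_word_chars:
            return WORD
        if ch.isspace():
            return SPACE
        return PUNCT
    return classify

def backward_word(text, pos, classify):
    """Move left: skip anything that is not a word char, then skip the
    word chars, stopping at the first non-word char (or at column 0)."""
    while pos > 0 and classify(text[pos - 1]) != WORD:
        pos -= 1
    while pos > 0 and classify(text[pos - 1]) == WORD:
        pos -= 1
    return pos

def backward_stops(text, classify):
    """Collect every position reached by repeating backward_word from the end."""
    pos, out = len(text), []
    while pos > 0:
        pos = backward_word(text, pos, classify)
        out.append(pos)
    return out

text = "water $$ does"
print(backward_stops(text, make_table()))     # [9, 0]     $$ is skipped
print(backward_stops(text, make_table("$")))  # [9, 6, 0]  $$ is a stop
```

The movement code is the same in both runs. Only the table differs, and that alone decides whether the $$ cluster is skipped or becomes a stop.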

Syntax Tables (ELISP Manual)

The question is whether other editors' word movement behavior changes depending on the current language mode. And if so, how does the behavior change? Do they use a concept similar to emacs's syntax table?

In Notepad++, cursor word-motion behavior does not change with the language mode. A test of some 5 minutes suggests it does not in vim either.

More Cursor Movement Tests

Now, create a file with this content for more tests.

something in the water does not compute
something !! in @@ the ## water $$ does %% not ^^ compute
something!!in@@the##water$$does%%not^^compute
(defun insert-p-tag () "Insert <p></p> at cursor point."
  (interactive) (insert "<p></p>") (backward-char 4))
for (my $i = 0; $i < 9; $i++) { print "done!";}
<a><b>a b c</b> d e</a>

Answer this:

Which is More Efficient?

Now, the interesting question is which model is more efficient for general everyday coding of different languages.

First question is: is it more efficient in general for left/right word motions to always land on the left boundary of the word, as in vim, Notepad, Notepad++?

Certainly i think it is more intuitive that way. But there's a flaw. With this scheme, you can't place the cursor at the end of a word by just pressing forward/backward-word commands.

The second question is whether it is good to have the movement change depending on the language mode.

It seems that not depending on language mode is more intuitive, because the behavior is predictable everywhere. Of course, it MAY be less efficient, because logically one would think it better to have word motion adapt to each language. But in my experience, cursor movement that depends on the lang mode doesn't really have practical benefits. I can't think of any lang mode where i want backward/forward-word to behave in a special way. (This does not include special modes, such as dired, shell, chat client, etc, which aren't really text editing.)

From my experience, emacs syntax table is very annoying. In theory, it's a good thing, as it provides flexibility. But in practice, it thwarts user expectation in cursor movement, and the design is really lousy: you cannot use it to describe syntax at all, the char classes it provides are really narrow-minded, and the syntax is extremely complex and hard to work with.

This article is inspired by Paul Drummond's question in gnu.emacs.help.

How to Change Cursor Movement in Emacs

Elena [egarr…@gmail.com] wrote:

is there some elisp code to move by tokens when a programming mode is
active? For instance, in the following C code:

double value = f ();

the point - represented by | - would move like this:

|double value = f ();
double |value = f ();
double value |= f ();
double value = |f ();
double value = f |();
double value = f (|);
double value = f ()|;

c-mode has functions c-forward-token-1 and c-forward-token-2. (thanks to Andreas Politz)
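The stops in Elena's example can be reproduced with a very small tokenizer. The following Python sketch is not how c-mode does it. It assumes a token is either a run of word characters or one single punctuation character, which happens to match the example; a real C tokenizer would keep operators such as ++ or <= together. The names token_starts, forward_token and show are made up here.

```python
import re

# Assumption: a token is a run of word chars, or one punctuation char.
TOKEN = re.compile(r"\w+|[^\w\s]")

def token_starts(text):
    """Column of the first char of every token in `text`."""
    return [m.start() for m in TOKEN.finditer(text)]

def forward_token(text, pos):
    """Move right to the beginning of the next token, or to end of line."""
    for start in token_starts(text):
        if start > pos:
            return start
    return len(text)

def show(text, pos):
    """Render the line with | marking the cursor position."""
    return text[:pos] + "|" + text[pos:]

line = "double value = f ();"
for p in token_starts(line):
    print(show(line, p))
```

Run on the C line from the question, this prints the same seven cursor positions listed above, from |double to ()|;.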

It is easy to write elisp code that moves by a different definition of word boundary. See: Emacs: Move Cursor to Bracket 🚀.
