Nested Syntax: Brackets vs Begin End Keywords vs Indentation
Which Syntax Design is Best for Nested Syntax
Here are three programming language syntaxes for nesting.
Bracket style
[11, [21, 22], 31]
Begin End Keywords style
begin 11, begin 21, 22 end , 31 end
Indentation style
11
  21
  22
31
Bracket Style is Best
- Easiest to type.
- Allows automatic formatting to any desired style of indentation.
- Logically clear.
- Easiest for a parser to parse.
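To illustrate the automatic formatting point, here is a small sketch using Python's standard json module on the bracket example. Because the nesting lives entirely in the brackets, the same data can be printed in several layouts without changing its meaning. The variable names are only for illustration.

```python
import json

# The bracket-style example happens to be valid JSON.
data = json.loads("[11, [21, 22], 31]")

# Same nesting, three different layouts chosen by the formatter.
compact = json.dumps(data, separators=(",", ":"))
one_line = json.dumps(data)
expanded = json.dumps(data, indent=2)

print(compact)
print(one_line)
print(expanded)
```

All three strings parse back to the same nested structure, so the layout is purely a matter of presentation.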
Of all characters, brackets such as ( ) [ ] { } have a unique property: each one carries information about a matching character elsewhere in the text. This means any editor, language, or parser can trivially extract the nesting structure. Lisp uses this to its advantage, and from it came the widely praised Lisp macros. JSON benefits similarly.
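As a rough sketch of how little work this takes, the following Python function pairs every opening bracket with its closing bracket using a single stack. The function name match_brackets is made up for this example, and the sketch assumes the text has no brackets inside string literals or comments.

```python
def match_brackets(text):
    """Map the index of each opening bracket to the index of its closer."""
    pairs = {"(": ")", "[": "]", "{": "}"}
    closers = set(pairs.values())
    stack = []    # indexes of opening brackets not yet closed
    matches = {}
    for i, ch in enumerate(text):
        if ch in pairs:
            stack.append(i)
        elif ch in closers:
            # The closer must match the most recently opened bracket.
            if not stack or pairs[text[stack[-1]]] != ch:
                raise ValueError(f"unmatched {ch!r} at index {i}")
            matches[stack.pop()] = i
    if stack:
        raise ValueError(f"unclosed {text[stack[-1]]!r} at index {stack[-1]}")
    return matches


print(match_brackets("[11, [21, 22], 31]"))
```

No knowledge of the language is needed here: the characters themselves say where each nested unit starts and ends.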
Begin End Keywords Style
- Visually cluttered.
- Nesting structure is hard to recognize.
- More work for editors to parse. As a result, almost no editor offers features that edit such code by tree-branch semantics.
Languages that use keywords such as begin and end, for example Ruby and Pascal, need semantic info attached to those strings: a parser or text editor must first recognize them as keywords before it can understand the nesting structure, which is more work.
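A minimal sketch of that extra step, in Python, for the toy begin/end example above. The function name parse_begin_end and the toy grammar (only integers, begin, and end) are assumptions for illustration. Note that the text must be split into words first, and the parser has to know which words are keywords.

```python
import re


def parse_begin_end(text):
    """Parse text like 'begin 11, begin 21, 22 end, 31 end' into nested lists."""
    # Step 1: tokenize into words and integers. Commas and spaces are dropped.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+", text)
    # Step 2: track nesting. stack[0] collects the top-level items.
    stack = [[]]
    for tok in tokens:
        if tok == "begin":
            stack.append([])
        elif tok == "end":
            if len(stack) == 1:
                raise ValueError("'end' without matching 'begin'")
            done = stack.pop()
            stack[-1].append(done)
        elif tok.isdigit():
            stack[-1].append(int(tok))
        else:
            raise ValueError(f"unexpected word {tok!r}")
    if len(stack) != 1:
        raise ValueError("'begin' without matching 'end'")
    return stack[0]


print(parse_begin_end("begin 11, begin 21, 22 end , 31 end"))
```

In a real language the keyword set is larger and some keywords only open a block in certain contexts, so the tokenizing step carries even more of the burden.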
Indentation Syntax is Worst
- Most difficult to type. Each node in a tree has to be put on its own line, with the amount of indentation corresponding to the node's level.
- Most difficult to parse.
- Most difficult to edit, even with editor help, because to unnest an element or move it to another level, you have to change the indentation of every line involved.
- Removes any possibility of automatic formatting.
- Fixes the code into an ASCII-art style.
- Prone to syntax error if missing a space.
Languages that use indentation for blocks and nesting, such as Python, Haskell, HAML, YAML, and Slim, are non-trivial to parse. They require a dedicated parser to understand the nesting.
Indentation syntax is also complex and prone to invalid syntax, because the nesting info must be embedded in each and every line of a nested structure.
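The following Python sketch shows what such a parser has to do for the toy indentation example above. The function name parse_indented is made up, and the sketch assumes indentation uses spaces only and every non-blank line holds one integer. The parser must measure every line, compare it with the levels still open, and reject any line whose indent matches none of them.

```python
def parse_indented(text):
    """Parse indentation-nested lines of integers into nested lists."""
    root = []
    # Each entry is (indent width, list collecting items at that indent).
    stack = [(0, root)]
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines carry no nesting info
        indent = len(line) - len(line.lstrip(" "))
        if indent > stack[-1][0]:
            # Deeper than the current level: open a new sublist.
            child = []
            stack[-1][1].append(child)
            stack.append((indent, child))
        else:
            # Same or shallower: close levels until one matches.
            while indent < stack[-1][0]:
                stack.pop()
            if indent != stack[-1][0]:
                raise ValueError(
                    f"line {lineno}: indent {indent} matches no open level"
                )
        stack[-1][1].append(int(line.strip()))
    return root


print(parse_indented("11\n  21\n  22\n31"))
```

Dropping a single space from the line holding 22 makes the input invalid, since an indent of 1 matches neither open level.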
Indentation syntax has another problem, the most serious one: it mixes semantic and syntactic info. The source code is hard-formatted in one fixed way, and must always be formatted manually. For an example of the resulting damage, see: Why Python Lambda is Broken and Can't be Fixed.
See also:
- Layout Syntax Considered Troublesome
- By Yin Wang.
- https://yinwang0.wordpress.com/2011/05/08/layout/
- Nested Syntax: XML vs LISP
- What Does it Mean When a Programing Language Claims “Whitespace is Insignificant”?
- Concepts and Confusions of Prefix, Infix, Postfix and Lisp Notations
- What Are Good Qualities of Computer Language Syntax?
- What is Function, What is Operator?
- Problems of Symbol Congestion in Computer Languages; ASCII Jam vs Unicode
- Programing Language Design: String Syntax