Math Notation, Proof System, Computer Algebra, in One Language

By Xah Lee. Date: . Last updated: .

This essay calls for a computer language with a formal syntax that is a merge of several existing technologies:

Here are the reasons.

Traditional Math Notation Lacks Grammar

• Traditional math notation is very inconsistent and lacks grammar. Edsger Dijkstra is a leader in a movement of what's called Calculational Proofs. That is, using a notation that is consistent and facilitates the calculation aspects when doing math by humans.

(e.g. The Problems of Traditional Math Notation)

Most Computer Languages's Syntax Lacks Grammar

Almost all computer languages's syntax, are ad hoc, inconsistent, and lacks grammar. e.g. the worst are bash, perl, PowerShell. Then, C, C++, Java, C#, TeX, including functional langs such as OCaml, Haskell, are all completely ad hoc.

A syntax that's consistent, with certain sytematic grammar, is very good. A good example are lisp, Mathematica, XML.

[see Formal Definition of Systematic Grammar]

Math Foundation, Formalism, Logicism, is Today's Proof Languages

Proof languages, such as HOL Light, Coq, Isabelle, Mizar, are today's actuation of the math foundation work of Formalism and Logicism (e.g. Principia Mathematica.).

Today's work on math foundation needs to be done on a computer, so that axioms and deductions are in some absolute, formal, sense.

(See also: State of Theorem Proving Systems 2008.)

Proof Languages and Computer Algebra System

Works done in math proof languages (e.g. HOL Light, Coq, Isabelle, Mizar), and computer algebra systems (e.g. Mathematica, Maple), often they require each other, yet, typically these two systems do not co-exist as one language.

It is relative easy, to create a language that does both in a coherent manner. This is no more difficult than the difficulty of writing each system.

Also, numeral systems (e.g. Mathcad, Matlab), should also be part of it.

Math Display System

• Today, especially since 1990s, tremendous advances are made in computer algebra systems and theorem proving systems. In these languages, a coherent syntax, grammar, are needed for math expressions.

• Math expressions/syntax in computer languages are intimately tied to math notations for human reading. (e.g. Mathematica, MathML are technologies that combine the two.)

• The syntax and grammar of today's computer languages, such as Java, C, Python, SQL, Lisp, are ad-hoc and their communities have little understanding of the knowledge gained in math related fields such as computer algebra or theorem proving languages. (This applies to functional langs such as Haskell as well, but to a lesser degree.) On the other hand, mathematicians in general are illiterate about programing or using computer languages.

All of the above considered together, computer language designers and mathematicians, should be made aware of these issues, so that when they design or use computer languages, may it be math oriented or not, the language's syntax and grammar can move towards a consistent syntax system with solid foundation (as opposed to ad-hoc), and such language should have build-in markup or simple mapping to 2-dimensional notations for human reading (such as done with Mathematica or Semantic MathML), and this computer language should be in fact as a basis of theorem prover or computer algebra system (as in OCaml, Haskell, Mathematica). The languages of computer algebra and theorem prover would in fact merge together into one single subject if it is not already slowly happening today.

Progress in the above issues are made in different fields but there are little unification going on.

e.g. there's Edsger Dijkstra's Calculational Proofs movement. It improves math notations towards consistency and formalism. However, people in Calculational Proofs movement are mostly math pedagogy community i think. They are not programers interested in computer languages, nor logicians interested in math formalism, or industrial and commercial organizations interested in math notation representation systems.

There's the computer algebra community, such as Mathematica, Maple, Matlab, which requires a syntax and grammar for mathematical concepts. There's the theorem proving community, such as OCaml, Coq, HOL, which not only requires a syntax for math concepts, but also made major understanding about math as a system of forms, i.e. formal systems. Both computer algebra and theorem proving systems require math notations and computer language syntax that are consistent and can represent math concepts. However, the 2 camps are largely separate communities. e.g. there is as far as i know no tool that is both a practical computer algebra system as well as a theorem proving language.

Common computer languages, such as C, Java, Python, requires a good syntax, parsers, and compilers, but their community, including computer scientists and programers, are usually illiterate in typical topics of mathematics proper. Functional languages, such as Scheme Lisp, APL, OCaml, Haskell, are more based on logic foundations (lambda calculus) but their syntax and grammar has little to do with the math notations as a logic or formal system. (these languages do not have a formal spec in the sense of Formlism, i.e. transformation of forms. In fact, almost no languages has a formal spec, formalism or not.)

There's math notation representation needs, such as TeX, Mathcad, MathML, Mathematica. These are typically commercial organizations in the computing industry. They can render math notations. In the case of MathML and Mathematica, the language also represent the semantic content of math notations. These two made major understanding about the relation of math notations and computer languages, but they in general have little to do with formalism or theorem proving. (with some exception of Mathematica)

Calculational proof notational system, Computer algebra systems, theorem prover languages, formalism and logicism as foundation of mathematics, functional languages, and computer languages in general, mathematics and its notations, all are in fact can be considered as a single subject with a unified goal. All the technologies and movements exist, but today they have mostly not come together. e.g. Microsoft Equaton Editor, TeX, and various other tools does well with math notation rendering. MathML has both representational and semantic aspects (OpenMath is a new group that focus on semantic aspects), for the purpose of rendering as well semantic representation. Mathematica is a computer algebra system for solving arbitrary math equations, that is also able to represent math notation as a computer language, so that computation can be done with math notation directly. However, the system lacks a foundation as a theorem prover. Theorem provers such as OCaml (HOL, Coq), Mizar does math formalism as a foundation, also function as a generic computer language, but lacks abilities as a computer algebra system or math notation representation.


On Sep 5, 7:41 am, slawekk [skoko…@yahoo.com] wrote:

Isabelle's presentation layer is well integrated with LaTeX and you can use math notation in (presentation of) proofs.

The point is, that formalism in mathematics, consistency of math notation issues (for human), math notation language systems (TeX, Mathematica, MathML), and calculational proof movement (e.g. Edsger Dijkstra), and computer algebra systems, and theorem proving systems, and computer language syntax, all of the above, should be unified into one single entity, and is today very much doable, in fact much of it is happening in disparate communities, but as far as i know i do not think there's any literature that expresses the idea that they should all be rolled into one.

few things to note:

Theorem proving systems and computer algebra systems as unified tool is very much a need and is already happening. (e.g. there's a project i forgot the name now that tries to make Mathematica into a theorem proving system like HOL.)

Theorem proving languages (isabell, hol, coq etc) and mathematics foundation by formalism should be unified. This active research the past 30 or more years, and is the explicit goal of the various theorem proving systems.

Math notation consistency issues for human communication, as the calculational proof movement by Dijkstra, and also Stephen Wolfram criticism of traditional notation and Mathematica's StandardForm, is actually one single issue. They should be known as one single issue, instead of Calculational Proof movement happening only in math pedagogy community and Mathematica in its own community.

Math notation issues and computer language syntax and logic notation syntax is also largely the same issue. Computer languages, or all computer languages, should move towards a formalized syntax system. I don't think there's much literature on this aspect (in comparison to other issues i mentioned in this essay). Most of today's computer language's syntax are ad hoc, with no foundation, no consistency, no formal approach. e.g. especially bad ones are Ocaml, and all C-like langs such as C, C++, Java. Shell langs are also good examples of extremely ad hoc, for example: bash, perl, PowerShell. There are langs that are more based on a consistent syntax system that are more or less can be reduced to a formal approach. Of those i know includes Mathematica, XML (and lots derivatives, for example: MathML) and lisps also. Other langs i don't know much but whose syntax i think are also close to a formal system are: APL, tcl.

My use of the phrase “syntax with formal foundation” or “syntax system” is a bit fuzzy and needs more thinking and explanation… but basically, the idea is that computer language's syntax can be formalize in the same way we trying to formalize mathematics (albeit the problem is much simpler), so that the syntax and grammar can be reduced to few very simple formal rules in parsing it, is consistent, easy to understand. Mathematica and XML are excellent examples. (note here that such syntax system does not imply they look simple. Mathematica is a example.)

the following 2 articles helps in explaining this:

Systems for displaying math, such as TeX, Mathematica, MathML, should be unified as part of the computer language's syntax. The point is that we should not have a lang just to generate the display of math notations such as done by TeX or MathML or Microsoft equation editor or other tools. Rather, it should be part of the general language for doing math. (e.g. any computer algebra system or theorem proving system)

A good example that already have done this since ~1997 is Mathematica.

Practically speaking, this means, when you code in a language (say, Ocaml), you don't just write code, but you can dynamically, interactively, have your code display math 2D notations, and the info about formatting the notation is part of the computer language, it's just that your IDE or specialized editor understand your language and can format it to render 2D notations on the fly (e.g. HTML is such a lang).

If you know Mathematica, then you understand this. Otherwise, think of HTML/XML, which is a lang for formatting purposes without computational ability, yet today there are XML based general purpose computer languages. This language is a example of several issues in this essay. i.e. it's syntax is formalized syntax system, it's is a general purpose computer language, and it has semantics for 2D notations or arbitrary formatting/rendering such as headers.


As a example of current situation in contrast of the above idea: suppose you doing some proof using OCaml derived theorem prover. Sometimes you need to do computer algebra, so you need to call Mathematica or Maple as supplement. Then often you need to display the result in math notation. So you'll need to call/output TeX or MathML. Then Dijkstra objects that your traditional math notation is so inconsistent, ambiguous, misleading, ad hoc, and does not help or correspond to the actual mathematical content behind them. So, you need to invent or re-write your notation to something proposed by the Calculational Proofs movement or Stephen Wolfram's (proprietary) Mathematica's StandardForm, that is not ambiguous.

I think what inspired me to arrive at this idea is mostly my experiences with Mathematica, and my interest in math formalism and logicism as foundation, and my interest in technologies such as computer algebra systems and display systems such as MathML and TeX, and the intricate issues of relation between math notations and mathematics.

This may sounds like pitching Mathematica, but as far as i know it is closest as the best example in unifying all these issues. It is a has a simple syntax system (i.e. the lang's syntax and grammar is not ad hoc). It is a general computer language. It is a computer algebra system (e.g. can solve math equations, etc.). The language also functions as a math notation display system (e.g. like TeX or MathML). It has a notation (StandardForm) that is compatible with the calculational proof movement.

What it lacks is functioning as a theorem proving system.

I'm singling out Mathematica because it is a system i know well and happened to be the most fitting example in this thesis. Note however, Mathematica is roughly the sole idea of Stephen Wolfram, and its syntax/grammar, is not the only approach. It just happens to be the lang that today has unified many of the issues in this essay (as far as i know). It is relatively easy to design alternative syntax.

Many approaches to this unified language/syntax/notation/mathematics system are possible. Different communities mentioned above are trying to unify or advance different aspects. (e.g. as another example i haven't mentioned above, there's a project in LaTeX that tries to make its syntax understand the semantics of math notations, as opposed to sequence of structurally meaningless symbols that renders to meaningless display… Lots other examples in different tools really)

I'll have to refine this essay for coherency and more concrete examples, perhaps with screen shots from different tools, syntax examples in different languages, rendered output in different tools, notation comparison from different schools, philosophies in formalism or logicism or computer proofing systems from different mathematicians, pertinent quotations and excerpts from various literature, and more academic references and industrial publications… but i hope this idea is conveyed reasonably.