And that’s pretty much it. Since lists are syntactically so trivial, the only remaining syntactic rules you need to know are those governing the form of different kinds of atoms. In this section I’ll describe the rules for the most commonly used kinds of atoms: numbers, strings, and names. After that, I’ll cover how s-expressions composed of these elements can be evaluated as Lisp forms.
Numbers are fairly straightforward: any sequence of digits, possibly prefaced with a sign (`+` or `-`), containing a decimal point (`.`) or a solidus (`/`), or ending with an exponent marker, is read as a number. For example:
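A few literals of the kind just described, as a minimal sketch to try at the REPL. The particular values are illustrative choices, and the comments assume a standard reader with its default settings (base ten, default float format left unchanged).

```lisp
;; Each literal evaluates to itself; the comment says what the reader builds.
123       ; the integer one hundred twenty-three
+42       ; the integer forty-two
-42       ; the integer negative forty-two
3/7       ; the ratio three-sevenths
-2/8      ; a ratio, reduced by the reader to -1/4
246/2     ; reduces all the way to the integer 123
1.0       ; a floating-point number in the default precision
1.0e0     ; the same number written with an explicit exponent
1.0d0     ; the number one as a double-precision float
1.0e-4    ; the floating-point equivalent of one ten-thousandth

;; Type predicates confirm how the tokens were read; this returns (T T T T).
(list (integerp 123) (integerp 246/2) (rationalp 3/7) (floatp 1.0e-4))
```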
These different forms represent different kinds of numbers: integers, ratios, and floating point. Lisp also supports complex numbers, which have their own notation and which I’ll discuss in Chapter 10.
String literals, as you saw in the previous chapter, are enclosed in double quotes. Within a string a backslash (`\`) escapes the next character, causing it to be included in the string regardless of what it is. The only two characters that must be escaped within a string are double quotes and the backslash itself. All other characters can be included in a string literal without escaping, regardless of their meaning outside a string. Some example string literals are as follows:
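The escaping rules can be checked directly at the REPL. The strings below are illustrative choices, offered as a small sketch of the rules rather than an exhaustive list.

```lisp
"foo"      ; three characters: f, o, o
"fo\o"     ; the same string; the backslash merely escapes the o
"fo\\o"    ; four characters: f, o, a backslash, o
"fo\"o"    ; four characters: f, o, a double quote, o

;; Counting characters shows that escapes are not part of the string.
;; This returns (3 4 4 T).
(list (length "fo\o")
      (length "fo\\o")
      (length "fo\"o")
      (string= "foo" "fo\o"))
```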
Names used in Lisp programs, such as **FORMAT** and `hello-world`, are represented by objects called symbols. The reader knows nothing about how a given name is going to be used, whether it’s the name of a variable, a function, or something else. It just reads a sequence of characters and builds an object to represent the name.6 Almost any character can appear in a name. Whitespace characters can’t, though, because the elements of lists are separated by whitespace. Digits can appear in names as long as the name as a whole can’t be interpreted as a number. Similarly, names can contain periods, but the reader can’t read a name that consists only of periods. Ten characters that serve other syntactic purposes can’t appear in names: open and close parentheses, double and single quotes, backtick, comma, colon, semicolon, backslash, and vertical bar. Even those characters can appear if you’re willing to escape them, either by preceding the character to be escaped with a backslash or by surrounding the part of the name that needs escaping with vertical bars.
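A quick way to see these rules in action is to ask whether a given token was read as a symbol or as a number. The names below are made up for illustration, and the leading single quote is there only to keep the REPL from evaluating the name as a variable. Each form in this sketch returns T.

```lisp
(symbolp 'hello-world)     ; a hyphenated name is a single symbol
(symbolp '1+)              ; contains a digit, but can't be read as a number
(integerp '+1)             ; this token can be read as a number, so it is one
(symbolp '|two words|)     ; vertical bars let the name contain a space
(symbolp 'semi\;colon)     ; a backslash escapes the semicolon
```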
Two important characteristics of the way the reader translates names to symbol objects have to do with how it treats the case of letters in names and how it ensures that the same name is always read as the same symbol. While reading names, the reader converts all unescaped characters in a name to their uppercase equivalents. Thus, the reader will read `foo`, `Foo`, and `FOO` as the same symbol: `FOO`. However, `\f\o\o` and `|foo|` will both be read as `foo`, which is a different object than the symbol `FOO`. This is why when you define a function at the REPL and it prints the name of the function, it’s been converted to uppercase. Standard style, these days, is to write code in all lowercase and let the reader change names to uppercase.7
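The case conversion is easy to observe at the REPL. This sketch assumes the reader's default case handling (unescaped letters are upcased); an implementation or readtable configured differently would give other results.

```lisp
(eq 'foo 'FOO)            ; => T, both tokens are read as the symbol FOO
(eq 'foo 'Foo)            ; => T
(symbol-name 'foo)        ; => "FOO"
(symbol-name '|foo|)      ; => "foo"
(symbol-name '\f\o\o)     ; => "foo"
(eq '|foo| 'foo)          ; => NIL, different names, so different symbols
(eq '|FOO| 'foo)          ; => T, escaping uppercase letters changes nothing
```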
Because names can contain many more characters in Lisp than they can in Algol-derived languages, certain naming conventions are distinct to Lisp, such as the use of hyphenated names like `hello-world`. Another important convention is that global variables are given names that start and end with `*`. Similarly, constants are given names starting and ending in `+`. And some programmers will name particularly low-level functions with names that start with `%` or even `%%`. The names defined in the language standard use only the alphabetic characters (A-Z) plus `*`, `+`, `-`, `/`, `1`, `2`, `<`, `=`, `>`, and `&`.
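The conventions might look like this in practice. Every name in this sketch (`*request-count*`, `+max-retries+`, `%bump-count`, `handle-request`) is hypothetical and chosen only to show the naming style.

```lisp
;; Global variable: name starts and ends with an asterisk.
(defvar *request-count* 0)

;; Constant: name starts and ends with a plus sign.
(defconstant +max-retries+ 3)

;; Low-level helper: name starts with a percent sign.
(defun %bump-count ()
  (incf *request-count*))

;; Ordinary function: lowercase words joined by hyphens.
;; Returns the new count, or NIL once the count exceeds the limit.
(defun handle-request ()
  (when (<= (%bump-count) +max-retries+)
    *request-count*))
```

None of these markers is enforced by the language; they are signals to the human reader about how a name is meant to be used.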
The syntax for lists, numbers, strings, and symbols can describe a good percentage of Lisp programs. Other rules describe notations for literal vectors, individual characters, and arrays, which I’ll cover when I talk about the associated data types in Chapters 10 and 11. For now the key thing to understand is how you can combine numbers, strings, and symbols with parentheses-delimited lists to build s-expressions representing arbitrary trees of objects. Some simple examples look like this:
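Here is a handful of such s-expressions, as an illustrative sketch. The leading single quote on each is not part of the s-expression; it is there so that the forms can be typed at the REPL as data without being evaluated.

```lisp
'x               ; the symbol X
'()              ; the empty list
'(1 2 3)         ; a list of three numbers
'("foo" "bar")   ; a list of two strings
'(x y z)         ; a list of three symbols
'(x 1 "foo")     ; a list of a symbol, a number, and a string
'(+ (* 2 3) 4)   ; a list of a symbol, a list, and a number

;; The nested list counts as a single element, so this returns 3.
(length '(+ (* 2 3) 4))
```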
An only slightly more complex example is the following four-item list that contains two symbols, the empty list, and another list, itself containing two symbols and a string:
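One list that fits this description is a definition of a `hello-world` function. The listing here is an assumed stand-in consistent with the description, stored as quoted data in a hypothetical variable `*example-form*` so that its structure can be inspected rather than evaluated.

```lisp
(defparameter *example-form*
  '(defun hello-world ()
     (format t "hello, world")))

(length *example-form*)     ; => 4
(first *example-form*)      ; => DEFUN, a symbol
(second *example-form*)     ; => HELLO-WORLD, a symbol
(third *example-form*)      ; => NIL, the empty list
(fourth *example-form*)     ; => (FORMAT T "hello, world")
```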