Above the level of individual bytes, most binary formats use a smallish number of primitive data types—numbers encoded in various ways, textual strings, bit fields, and so on—which are then composed into more complex structures. So your first task is to define a framework for writing code to read and write the primitive data types used by a given binary format.
To take a simple example, suppose you’re dealing with a binary format that uses an unsigned 16-bit integer as a primitive data type. To read such an integer, you need to read the two bytes and then combine them into a single number by multiplying one byte by 256, a.k.a. 2^8, and adding it to the other byte. For instance, assuming the binary format specifies that such 16-bit quantities are stored in big-endian form, with the most significant byte first, you can read such a number with a function along the lines of the one sketched below.
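A minimal sketch of such a reader (the name read-u2 is chosen here to mirror the write-u2 function defined later in this section, and in is assumed to be a binary input stream whose element type is (unsigned-byte 8)):

(defun read-u2 (in)
  ;; Read the most significant byte first, scale it by 256, then add
  ;; the least significant byte.
  (+ (* (read-byte in) 256) (read-byte in)))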
Common Lisp provides a more convenient way to perform this kind of bit twiddling: the function **LDB**, short for "load byte," extracts any number of contiguous bits from an integer. It takes a byte specifier and an integer; byte specifiers are created with the **BYTE** function, which takes the number of bits and the position of the rightmost bit, counting from zero at the least significant bit. Thus, to extract the least significant octet of an integer, you’d use the byte specifier (byte 8 0) like this:

CL-USER> (ldb (byte 8 0) #xabcd)
205

(205 is the decimal equivalent of #xcd.)
To get the next octet, you’d use a byte specifier of (byte 8 8) like this:

CL-USER> (ldb (byte 8 8) #xabcd)
171

(171 is the decimal equivalent of #xab.)
You can use **LDB** with **SETF** to set the specified bits of an integer stored in a **SETF**able place.
CL-USER> (defvar *num* 0)
*NUM*
CL-USER> (setf (ldb (byte 8 0) *num*) 128)
128
CL-USER> (setf (ldb (byte 8 8) *num*) 255)
255
CL-USER> *num*
65408
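Thus, a read-u2 built on **LDB** and **SETF** might look like the following sketch (again assuming in is a binary input stream of (unsigned-byte 8) elements):

(defun read-u2 (in)
  (let ((u2 0))
    ;; The format is big-endian, so the first byte read fills the high octet.
    (setf (ldb (byte 8 8) u2) (read-byte in))
    (setf (ldb (byte 8 0) u2) (read-byte in))
    u2))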
To write a number out as a 16-bit integer, you need to extract the individual 8-bit bytes and write them one at a time. To extract the individual bytes, you just need to use **LDB** with the same byte specifiers.
(defun write-u2 (out value)
  (write-byte (ldb (byte 8 8) value) out)
  (write-byte (ldb (byte 8 0) value) out))
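As a quick usage sketch (the file name is made up for illustration), you can round-trip a value through a file opened with an (unsigned-byte 8) element type:

CL-USER> (with-open-file (out "/tmp/u2-test.bin" :direction :output
                              :element-type '(unsigned-byte 8)
                              :if-exists :supersede)
           (write-u2 out #xabcd))
205
CL-USER> (with-open-file (in "/tmp/u2-test.bin" :element-type '(unsigned-byte 8))
           (read-u2 in))
43981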
Of course, you can also encode integers in many other ways—with different numbers of bytes, with different endianness, and in signed and unsigned format.
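For instance, here’s a sketch of how the same ideas extend to a little-endian 16-bit reader and to reinterpreting an unsigned value as a signed, two’s-complement one; the names read-u2-le and u2-to-s2 are made up for illustration:

(defun read-u2-le (in)
  ;; Little-endian: the least significant octet comes first in the stream.
  (let ((u2 0))
    (setf (ldb (byte 8 0) u2) (read-byte in))
    (setf (ldb (byte 8 8) u2) (read-byte in))
    u2))

(defun u2-to-s2 (u2)
  ;; Treat a 16-bit unsigned value as two's complement: values at or above
  ;; #x8000 represent negative numbers.
  (if (>= u2 #x8000) (- u2 #x10000) u2))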