Data Types
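Simple types contain exactly one type, which makes them simple to serialize and deserialize. All simple types have the following layout: the tsymbol, then the length of the element in bytes and a \n, followed by the element itself and a closing \n.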
For a unicode string ‘sayan’, the layout of the unicode string type (+) will look like:
+5\n # 'sayan' is a unicode string, so '+', and it has 5 bytes, so '5'
sayan\n # the element 'sayan' itself
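The layout above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: the function names `encode_simple` and `encode_unicode_string` are hypothetical, and the sketch assumes that unicode strings travel as UTF-8 and that the length counts encoded bytes.

```python
def encode_simple(tsymbol: str, payload: bytes) -> bytes:
    """Serialize one simple element: tsymbol, byte length, newline, payload, newline."""
    header = tsymbol.encode("ascii") + str(len(payload)).encode("ascii") + b"\n"
    return header + payload + b"\n"


def encode_unicode_string(value: str) -> bytes:
    # Assumption: unicode strings are sent as UTF-8, so the length is the
    # number of encoded bytes, not the number of characters.
    return encode_simple("+", value.encode("utf-8"))


print(encode_unicode_string("sayan"))  # b'+5\nsayan\n'
```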
Table
Do keep the matching for this symbol non-exhaustive since we might add more types in future revisions of the protocol.
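One way a client might keep its tsymbol matching non-exhaustive is to route every unrecognized symbol to an explicit fallback instead of assuming the known set is complete. The sketch below is illustrative only: `describe_tsymbol` and `UnknownTypeError` are hypothetical names, and the mapping covers only symbols mentioned on this page.

```python
class UnknownTypeError(Exception):
    """Raised when a tsymbol is not known to this client (hypothetical helper)."""


# Only the symbols named on this page; a real client would list more.
KNOWN_TSYMBOLS = {
    "+": "unicode string",
    "&": "array",
    "_": "flat array",
    "@": "typed array",
    "~": "any array",
    "^": "typed non-null array",
}


def describe_tsymbol(tsymbol: str) -> str:
    # The fallback branch is what keeps the match non-exhaustive: a symbol
    # added in a later protocol revision yields a clear error, not a crash
    # or a silent misparse.
    if tsymbol in KNOWN_TSYMBOLS:
        return KNOWN_TSYMBOLS[tsymbol]
    raise UnknownTypeError(f"unsupported type symbol: {tsymbol!r}")
```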
Compound types
Compound types are derived types — they are based on simple types, but often with some additional properties (and serialization/deserialization differences).
Type symbol (tsymbol) | Type | Additional notes | Protocol |
---|---|---|---|
& | Array | A recursive array | 1.0 |
_ | Flat array | A non-recursive array | 1.0 |
@ | Typed array | An array of a specific type, with nullable elements | 1.1 |
~ | Any array | An array with a single type but no information about the type | 1.1 |
^ | Typed non-null array | A non-recursive array with non-null elements | 1.1 |
Array
See the full discussion on arrays here.
Flat array
A flat array is like an array, except that it is non-recursive. This means that a flat array can contain all types except other compound types (hence the name ‘flat’).
So if you represent an array in a programming language like:
["hello", 12345, "world"];
then it will be serialized by Skyhash into:
Typed array
A typed array is like a flat array, except that it can hold only two kinds of values: an element of the declared type, or a NULL. Since the array declares a single element type up front, per-element tsymbols are not required, unlike in flat arrays.
You can think of each element as one of two cases:
- either there is no element (NULL)
- or there is an element of the declared type
Say a programming language represents an array like:
["omg", NULL, "happened"]
then it will be serialized by Skyhash into:
@+3\n
3\n
omg\n
\0\n
8\n
happened\n
Line-by-line explanation:
- @+3\n because it is a typed array, so @, the elements are unicode strings, so +, and there are three elements, so 3
- 3\n because ‘omg’ has 3 bytes
- omg\n, the element itself
- \0\n, NULL because there was no element
- 8\n because ‘happened’ has 8 bytes
- happened\n, the element itself
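The typed array walkthrough can be sketched as an encoder and a matching decoder. This is a minimal illustration under stated assumptions: the function names are hypothetical, the elements are unicode strings sent as UTF-8, `\0` denotes a single NUL byte (0x00), and the decoder receives the whole array in one buffer.

```python
from typing import List, Optional


def encode_typed_array(tsymbol: str, elements: List[Optional[str]]) -> bytes:
    # Header: '@', the declared element tsymbol, the element count, newline.
    out = f"@{tsymbol}{len(elements)}\n".encode("ascii")
    for element in elements:
        if element is None:
            out += b"\0\n"  # assumption: NULL is a single NUL byte
        else:
            payload = element.encode("utf-8")
            out += str(len(payload)).encode("ascii") + b"\n" + payload + b"\n"
    return out


def decode_typed_array(data: bytes) -> List[Optional[str]]:
    if data[:1] != b"@":
        raise ValueError("not a typed array")
    header_end = data.index(b"\n")
    count = int(data[2:header_end])  # data[1:2] holds the element tsymbol
    pos = header_end + 1
    elements: List[Optional[str]] = []
    for _ in range(count):
        line_end = data.index(b"\n", pos)
        line = data[pos:line_end]
        pos = line_end + 1
        if line == b"\0":
            elements.append(None)
            continue
        size = int(line)
        elements.append(data[pos:pos + size].decode("utf-8"))
        pos += size + 1  # skip the payload and its trailing newline
    return elements


wire = encode_typed_array("+", ["omg", None, "happened"])
print(wire)  # b'@+3\n3\nomg\n\x00\n8\nhappened\n'
```

Because the payload is read by its declared length rather than by scanning for a newline, an element that itself contains `\n` would still decode correctly in this sketch.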
Any array
An AnyArray is like a typed array, but without any explicit information about the element type. Currently, all elements must have the same type, but the client does not send that type; it is up to the server to convert the elements to the appropriate type for the action being run. This keeps running actions simple for clients, since they never have to specify the type. Despite this flexibility, AnyArrays are extremely performant. Also, no element in an AnyArray can be null.
If you have a programming language that represents a singly-typed array like:
["sayan", "is", "hiking"]
then Skyhash will serialize it into:
~3\n
5\n
sayan\n
2\n
is\n
6\n
hiking\n
Line-by-line explanation:
- ~3\n because this is an AnyArray with 3 elements
- 5\n because ‘sayan’ has 5 bytes
- sayan\n, the element ‘sayan’ itself
- 2\n because ‘is’ has 2 bytes
- is\n, the element ‘is’ itself
- 6\n because ‘hiking’ has 6 bytes
- hiking\n, the element ‘hiking’ itself
note
An AnyArray is currently a query-specific data type (only sent by the client and never by the server).
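Since an AnyArray is only ever sent by the client, an encoder is the piece a client library would need. The sketch below is a hedged illustration: `encode_any_array` is a hypothetical name, and the elements are assumed to be strings sent as UTF-8.

```python
from typing import List


def encode_any_array(elements: List[str]) -> bytes:
    # AnyArrays cannot contain null elements, so reject them up front.
    if any(element is None for element in elements):
        raise ValueError("an AnyArray cannot contain null elements")
    # Header: '~' followed by the element count; no element tsymbol is sent.
    out = f"~{len(elements)}\n".encode("ascii")
    for element in elements:
        payload = element.encode("utf-8")  # assumption: UTF-8 on the wire
        out += str(len(payload)).encode("ascii") + b"\n" + payload + b"\n"
    return out


print(encode_any_array(["sayan", "is", "hiking"]))
# b'~3\n5\nsayan\n2\nis\n6\nhiking\n'
```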
Typed non-null array
A typed non-null array is just like a typed array, except that its elements can never be null. Say you have an array of two strings like this:
["super", "wind"]
Then it will be represented like this:
^+2\n
5\n
super\n
4\n
wind\n
Line-by-line explanation:
- ^+2\n because this is a typed non-null array, with two string elements
- 5\n because the first element is “super” and has 5 bytes
- super\n, the element itself
- 4\n because the second element is “wind” and has 4 bytes