Create a table that is partitioned across data nodes by the 'location' column. Note that the number of space partitions automatically equals the number of data nodes assigned to this hypertable (all configured data nodes in this case, since data_nodes is not specified):

```sql
SELECT create_distributed_hypertable('conditions', 'time', 'location');
```
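If you need explicit control over placement, create_distributed_hypertable also accepts optional parameters such as number_partitions and data_nodes (shown here as a sketch; adjust the node names to match your cluster):

```sql
-- Use two space partitions and restrict the hypertable to two named data nodes.
SELECT create_distributed_hypertable('conditions', 'time', 'location',
    number_partitions => 2,
    data_nodes => '{ "data_node_1", "data_node_2" }');
```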
Best practices
Space partitions: Unlike regular (single-node) hypertables, space partitioning is highly recommended for distributed hypertables. Incoming data is divided among data nodes based on the space partition (the first one, if multiple space partitions are defined). Without a space partition, all the data for each time slice is written to a single data node.
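Because the partition count defaults to the number of data nodes at creation time, you may want to adjust it later, for example after adding nodes. A sketch using the set_number_partitions API:

```sql
-- Change the number of space partitions used for new chunks to three.
SELECT set_number_partitions('conditions', 3);
```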
For example, assume you are ingesting 10 GB of data per day and have five data nodes, each with 64 GB of memory. If this is the only table served by these data nodes, you should use a time interval of one week, so that the most recent chunks occupy roughly 22% of main memory ((7 × 10 GB) / (5 × 64 GB) ≈ 22%).
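Following this sizing guideline, the chunk interval can be set when the hypertable is created, via the chunk_time_interval parameter:

```sql
-- One-week chunks, per the memory-sizing estimate above.
SELECT create_distributed_hypertable('conditions', 'time', 'location',
    chunk_time_interval => INTERVAL '1 week');
```

For an existing hypertable, the interval of future chunks can be changed with set_chunk_time_interval('conditions', INTERVAL '1 week').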
Replication factor: The hypertable’s replication_factor defines the number of data nodes to which a newly created chunk is replicated. That is, a chunk with a replication_factor of three exists on three separate data nodes, and rows written to that chunk are inserted into all three chunk copies as part of a two-phase commit protocol. For chunks replicated more than once, if a data node fails or is removed, no data is lost, and writes continue to succeed on the remaining chunk copies. However, the chunks present on the lost data node become under-replicated. Currently, it is not possible to restore under-replicated chunks, although this limitation might be removed in a future release. To avoid such inconsistency, we do not yet recommend using a replication_factor greater than one; instead, rely on physical replication of each data node if such fault tolerance is required.
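For illustration only, replication_factor can be supplied at creation time; note that the default of 1 remains the recommended setting, per the caveat above:

```sql
-- Replicate each new chunk to three data nodes.
-- Not yet recommended; shown only to illustrate the parameter.
SELECT create_distributed_hypertable('conditions', 'time', 'location',
    replication_factor => 3);
```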