Hypertable and chunk architecture
Hypertables partition data into chunks by time, and optionally by space.
A hypertable is composed of many child tables, called chunks. Each chunk has a time constraint, and contains only data from that time range. When you insert data into a hypertable, TimescaleDB automatically creates chunks based on the time values of your data.
For example, assume each chunk contains 1 day’s worth of data. For each day for which you have data, TimescaleDB creates a chunk. For each row of data, it looks at the time column and inserts the data into the right chunk. All rows with time values from the same day are inserted into the same chunk. Rows belonging to different days are inserted into different chunks.
This happens behind the scenes. You run regular inserts and queries to your database, and TimescaleDB automatically handles the partitioning.
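The routing rule described above can be sketched in a few lines of Python. This is an illustrative model only, not TimescaleDB's implementation: the names `chunk_bounds` and `route_rows`, the epoch-aligned boundaries, and the 1-day interval are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

# Hypothetical chunk interval for this sketch (the example in the text uses 1 day).
CHUNK_INTERVAL = timedelta(days=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_bounds(ts, interval=CHUNK_INTERVAL):
    """Return the half-open [start, end) time range of the chunk that holds ts."""
    index = (ts - EPOCH) // interval  # whole intervals elapsed since the epoch
    start = EPOCH + index * interval
    return start, start + interval


def route_rows(rows):
    """Group (timestamp, value) rows by the chunk range they fall into."""
    chunks = defaultdict(list)
    for ts, value in rows:
        chunks[chunk_bounds(ts)].append((ts, value))
    return dict(chunks)


rows = [
    (datetime(2024, 3, 1, 8, 0, tzinfo=timezone.utc), 20.5),
    (datetime(2024, 3, 1, 23, 59, tzinfo=timezone.utc), 21.0),
    (datetime(2024, 3, 2, 0, 0, tzinfo=timezone.utc), 19.8),
]
chunks = route_rows(rows)  # two chunks: one for March 1, one for March 2
```

The first two rows share a day and so share a chunk. The third row falls exactly on the next day's boundary and goes to a new chunk, because each range is half-open.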
Note: This section uses the example of 1-day chunks to explain how hypertables work. The default chunk time interval is actually 7 days, and you can change it to suit your data ingest patterns. To learn about best practices for chunk time intervals, see the documentation on chunk time intervals.
All TimescaleDB hypertables are partitioned by time. In addition, they might also be partitioned by other columns. This is called time-and-space partitioning.
Time-and-space partitioning is most useful for distributed hypertables. These are hypertables in multi-node databases, where the hypertable data is distributed among nodes. Space partitions aren’t usually recommended for single-node databases, where they don’t offer much benefit and may even degrade performance.
In distributed hypertables, time-and-space partitioning allows for parallel inserts and queries. For example, say that one node stores data from device A and another stores data from device B. At a certain time, you get new data for both device A and device B. You can write device A’s data to the first node, and write device B’s data to the second node, at the same time.
For more information, see the section on distributed hypertables.
Hypertables are made of chunks. Each chunk is itself a standard PostgreSQL table. In PostgreSQL terminology, the hypertable is a parent table and the chunks are its child tables.
To enforce each chunk’s time boundaries, the chunk includes time constraints. For example, a specific chunk might contain only data with time values that fall within one particular day.
TimescaleDB catalogs the chunk constraints and uses them to optimize database operations. When a row is inserted, the planner looks at its time value, finds the correct chunk, and routes the insertion there. When a query is made, the planner pushes the query down to only the affected chunks. For example, if a query has a WHERE clause that restricts the time column to the past week, the database only executes the query against chunks covering the past week. It excludes older chunks.
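The chunk exclusion described above amounts to a range-overlap test between each chunk's time constraint and the query's time filter. The following Python sketch models that idea under stated assumptions: the function name `overlapping_chunks`, the fixed value of `now`, and the ten 1-day chunks are hypothetical, and the real planner works on cataloged constraints rather than a Python list.

```python
from datetime import datetime, timedelta, timezone


def overlapping_chunks(chunks, query_start, query_end):
    """Return chunks whose [start, end) range intersects [query_start, query_end)."""
    return [
        (start, end)
        for start, end in chunks
        if start < query_end and end > query_start
    ]


# Ten hypothetical 1-day chunks ending at a fixed "now" for reproducibility.
now = datetime(2024, 3, 11, tzinfo=timezone.utc)
day = timedelta(days=1)
chunks = [(now - (i + 1) * day, now - i * day) for i in range(10)]

# A query restricted to the past week only needs the 7 most recent chunks.
selected = overlapping_chunks(chunks, now - 7 * day, now)
```

The three oldest chunks end at or before the start of the queried week, so they are never scanned.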
All of this happens in the background. From your perspective, the hypertable should look just like a regular PostgreSQL table.
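To choose how the space values map to partitions, TimescaleDB uses one of two methods:
- Hash-based partitioning. Each value in the space column is hashed, and the hash determines which partition the value falls into.
- Interval-based partitioning. This works the same way as time-based partitioning. All the values in a specified range fall into the same partition.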
Like time partitions, space partitions never overlap.
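As a rough illustration of how space values can map to non-overlapping partitions, the sketch below shows both a hash-based mapping and an interval-based mapping. It is not TimescaleDB's code: `hash_partition` and `interval_partition` are hypothetical names, and CRC32 is only a stand-in for whatever hash function the database actually uses.

```python
import zlib


def hash_partition(value, number_partitions):
    """Map a value to a partition by hashing it (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(value).encode("utf-8")) % number_partitions


def interval_partition(value, interval_size):
    """Map an integer value to the partition covering its fixed-size range."""
    return value // interval_size


# The same device always lands in the same hash partition.
p1 = hash_partition("device-A", 4)
p2 = hash_partition("device-A", 4)

# With ranges of 100, ids 0-99 share partition 0 and ids 100-199 share partition 1.
ranges = [interval_partition(v, 100) for v in (5, 99, 100, 250)]
```

In both mappings every value belongs to exactly one partition, which is why the partitions never overlap.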
Rather than building a global index over an entire hypertable, TimescaleDB builds local indexes on each chunk. In other words, each chunk has its own index that only indexes data within that chunk. This optimization improves insert speed for recent data. For more information, see the section on local indexes.
Even with multiple local indexes, TimescaleDB can still ensure global uniqueness for keys. It enforces an important constraint: any key that requires uniqueness, such as a primary key, must include all columns that are used for data partitioning.
In other words, because data is partitioned between chunks based on time value, the unique key must include the time value. When data is inserted, TimescaleDB identifies the corresponding time chunk. Using that chunk’s unique index, it checks for uniqueness within the chunk. Because no other chunk can contain that time value, uniqueness within the chunk implies global uniqueness.
If another column is used for partitioning, the same logic applies. TimescaleDB identifies the correct chunk using the time-and-space partitions and checks for uniqueness using the chunk’s unique index. If the unique index contains both the time and space partitioning parameters, then those values can appear in no other partition. Uniqueness within the partition implies global uniqueness.
These checks happen in the background. As a user, you run a regular INSERT command, and the correct index is automatically updated.
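The argument that per-chunk uniqueness implies global uniqueness can be made concrete with a toy model. In the Python sketch below, each chunk keeps its own set of keys as a stand-in for a local unique index. The class name `ToyHypertable`, the `(time, device)` key, and the 1-day interval are assumptions for illustration, not TimescaleDB internals.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
CHUNK_INTERVAL = timedelta(days=1)  # hypothetical interval for the sketch


class ToyHypertable:
    """Toy model: one local unique index (a set) per time chunk."""

    def __init__(self):
        self.chunk_indexes = {}  # chunk number -> set of (time, device) keys

    def insert(self, ts, device):
        # The time value alone determines the chunk, so only one index is checked.
        chunk_id = (ts - EPOCH) // CHUNK_INTERVAL
        index = self.chunk_indexes.setdefault(chunk_id, set())
        key = (ts, device)  # the unique key includes the partitioning column
        if key in index:
            raise ValueError(f"duplicate key {key}")
        index.add(key)


table = ToyHypertable()
t = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
table.insert(t, "device-A")
table.insert(t, "device-B")                      # different key, same chunk
table.insert(t + timedelta(days=1), "device-A")  # same device, other chunk

try:
    table.insert(t, "device-A")  # exact duplicate is caught by the chunk's index
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Because a key with a given time value can only ever be routed to one chunk, checking that chunk's index is enough. No other chunk needs to be consulted.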