Compression
Compressing time-series data in a hypertable is a two-step process. First, you need to enable compression on a hypertable by telling TimescaleDB how to compress and order the data as it is compressed. Once compression is enabled, the data can then be compressed in one of two ways:
- Using an automatic policy
- Manually compressing chunks
To enable compression, you need to ALTER the hypertable. There are three parameters you can specify when enabling compression:
timescaledb.compress
(required): enable TimescaleDB compression on the hypertabletimescaledb.compress_orderby
(optional): columns used to order compressed data
If you do not specify compress_orderby
or compress_segmentby
columns, the compressed data is automatically ordered by the hypertable time column.
Use this SQL function to enable compression on the
stocks_real_time
hypertable:View and verify the compression settings for your hypertables by using the
compression_settings
informational view, which returns information about each compression option and itsorderby
andsegmentby
attributes:SELECT * FROM timescaledb_information.compression_settings;
The results look like this:
hypertable_schema|hypertable_name |attname|segmentby_column_index|orderby_column_index|orderby_asc|orderby_nullsfirst|
public |stocks_real_time|symbol | 1| | | |
public |stocks_real_time|time | | 1|false |true |
note
Automatic compression
When you have enabled compression, you can schedule a policy to automatically compress data according to the settings you defined earlier.
For example, if you want to compress data on your hypertable that is older than two weeks, run this SQL:
Similar to the continuous aggregates policy and retention policies, when you run this SQL, all chunks that contain data that is at least two weeks old are compressed in stocks_real_time
, and a recurring compression policy is created.
It is important that you don’t try to compress all your data. Although you can insert new data into compressed chunks, compressed rows can’t be updated or deleted. Therefore, it is best to only compress data after it has aged, once data is less likely to require updating.
Just like for automated policies for continuous aggregates, you can view information and statistics about your compression background job in these two information views:
Policy details:
SELECT * FROM timescaledb_information.job_stats;
While it is usually best to use compression policies to compress data automatically, there might be situations where you need to manually compress chunks.
Use this query to manually compress chunks that consist of data older than 2 weeks. If you manually compress hypertable chunks, consider adding if_not_compressed=>true
to the function. Otherwise, TimescaleDB shows an error when it tries to compress a chunk that is already compressed:
Verify your compression
You can check the overall compression rate of your hypertables using this query to view the size of your compressed chunks before and after applying compression:
SELECT pg_size_pretty(before_compression_total_bytes) as "before compression",
pg_size_pretty(after_compression_total_bytes) as "after compression"
FROM hypertable_compression_stats('stocks_real_time');
Sample results:
|before compression|after compression|
|------------------|-----------------|
|326 MB |29 MB |
Your overview of TimescaleDB is almost complete. The final thing to explore is , which allows you to drop older raw data from a hypertable quickly without deleting data from the precalculated continuous aggregate.
Learn more about compression
For more information on how native compression in TimescaleDB works, as well as the compression algorithms involved, see this in-depth blog post on the topic: .
For more information, see the compression docs.