Compression

Compressing time-series data in a hypertable is a two-step process. First, you need to enable compression on a hypertable by telling TimescaleDB how to compress and order the data as it is compressed. Once compression is enabled, the data can then be compressed in one of two ways:

  • Using an automatic policy
  • Manually compressing chunks

To enable compression, you need to ALTER the hypertable. There are three parameters you can specify when enabling compression:

  • timescaledb.compress (required): enable TimescaleDB compression on the hypertable
  • timescaledb.compress_orderby (optional): columns used to order compressed data

If you do not specify compress_orderby or compress_segmentby columns, the compressed data is automatically ordered by the hypertable time column.

  1. Use this SQL function to enable compression on the stocks_real_time hypertable:

  2. View and verify the compression settings for your hypertables by using the compression_settings informational view, which returns information about each compression option and its orderby and segmentby attributes:

    1. SELECT * FROM timescaledb_information.compression_settings;
  3. The results look like this:

    1. hypertable_schema|hypertable_name |attname|segmentby_column_index|orderby_column_index|orderby_asc|orderby_nullsfirst|
    2. public |stocks_real_time|symbol | 1| | | |
    3. public |stocks_real_time|time | | 1|false |true |
note

Automatic compression

When you have enabled compression, you can schedule a policy to automatically compress data according to the settings you defined earlier.

For example, if you want to compress data on your hypertable that is older than two weeks, run this SQL:

Similar to the continuous aggregates policy and retention policies, when you run this SQL, all chunks that contain data that is at least two weeks old are compressed in stocks_real_time, and a recurring compression policy is created.

It is important that you don’t try to compress all your data. Although you can insert new data into compressed chunks, compressed rows can’t be updated or deleted. Therefore, it is best to only compress data after it has aged, once data is less likely to require updating.

Just like for automated policies for continuous aggregates, you can view information and statistics about your compression background job in these two information views:

Policy details:

  1. SELECT * FROM timescaledb_information.job_stats;

While it is usually best to use compression policies to compress data automatically, there might be situations where you need to manually compress chunks.

Use this query to manually compress chunks that consist of data older than 2 weeks. If you manually compress hypertable chunks, consider adding if_not_compressed=>true to the function. Otherwise, TimescaleDB shows an error when it tries to compress a chunk that is already compressed:

Verify your compression

You can check the overall compression rate of your hypertables using this query to view the size of your compressed chunks before and after applying compression:

  1. SELECT pg_size_pretty(before_compression_total_bytes) as "before compression",
  2. pg_size_pretty(after_compression_total_bytes) as "after compression"
  3. FROM hypertable_compression_stats('stocks_real_time');

Sample results:

  1. |before compression|after compression|
  2. |------------------|-----------------|
  3. |326 MB |29 MB |

Your overview of TimescaleDB is almost complete. The final thing to explore is , which allows you to drop older raw data from a hypertable quickly without deleting data from the precalculated continuous aggregate.

Learn more about compression

For more information on how native compression in TimescaleDB works, as well as the compression algorithms involved, see this in-depth blog post on the topic: .

For more information, see the compression docs.