Using Table Partitioning Effectively

Introduction

In the previous article, we explored how to use table partitioning. However, you might have realized that we need to create partitions manually. This approach has limitations, including the difficulty of centralized management and the risk of missing partitions if data volume grows excessively large in the future (requiring additional partitions). Therefore, I will guide you on how to create and manage partitions effectively.

Hash Partitioning

  • Hash Partitioning operates by evenly distributing data into a fixed number of partitions right from the start using the MODULUS algorithm, which includes the following information:
    • MODULUS (M): The total fixed number of partitions into which you want to distribute the data.
    • REMAINDER (R): The identifying remainder for that specific partition (ranging from 0 to M - 1).
  • The MODULUS algorithm works exactly as its name suggests by calculating the remainder. When inserting a record into the parent table:
    • Postgres will use a hash function to generate a Hash Number based on the value of the partition column
    • Then it divides the Hash Number by the MODULUS to find the remainder
    • The resulting remainder determines the actual partition where the data is inserted (REMAINDER)
  • Consequently, you can see that the total number of partitions created initially is fixed and cannot be changed arbitrarily.
    • This limitation occurs because of the remainder operation. The same value yields the same Hash Number but if the MODULUS changes (such as increasing from 3 to 4, where hash % 3 != hash % 4), Postgres will fail to locate the correct REMAINDER to retrieve the value
    • Therefore, the solution is to select a sufficiently large number of partitions right from the beginning to handle future loads, such as 8, 16 or 32 partitions
    • Alternatively, you must create a new table to re-partition and migrate the data over

Other Types of Partitioning

On the other hand, when using Range Partitioning and List Partitioning, even in combination with Sub-partitioning, you can use a PROCEDURE in Postgres to pre-create the necessary partitions required for the future.

Detail

For Hash Partitioning, use a DO BLOCK as follows to pre-create the partitions. You can change the num_partitions value to the corresponding number of partitions you want to create initially.

DO $$
DECLARE 
    i INT;
    num_partitions INT := 8;
BEGIN
    FOR i IN 0..(num_partitions - 1) LOOP
        EXECUTE format('CREATE TABLE IF NOT EXISTS users_p%s PARTITION OF users FOR VALUES WITH (MODULUS %s, REMAINDER %s);', i, num_partitions, i);
    END LOOP;
END $$;


Next is the PROCEDURE used to check the current time and generate corresponding partitions for both the present and the future.

CREATE OR REPLACE PROCEDURE prc_orders_auto_maintain_partitions()
LANGUAGE plpgsql AS $$
DECLARE
    current_check_date DATE;
    start_date DATE;
    end_date DATE;
    partition_range_name TEXT;
    sub_partition_jp TEXT;
    sub_partition_us TEXT;
    sub_partition_default TEXT;
    i INT;
BEGIN
    FOR i IN 0..1 LOOP
        current_check_date := CURRENT_DATE + (i * INTERVAL '6 months');
        
        IF EXTRACT(MONTH FROM current_check_date) <= 6 THEN
            start_date := DATE_TRUNC('year', current_check_date)::date;
            end_date := (start_date + INTERVAL '6 months')::date;
            partition_range_name := 'orders_' || TO_CHAR(start_date, 'YYYY') || '_h1';
        ELSE
            start_date := (DATE_TRUNC('year', current_check_date) + INTERVAL '6 months')::date;
            end_date := (DATE_TRUNC('year', current_check_date) + INTERVAL '1 year')::date;
            partition_range_name := 'orders_' || TO_CHAR(start_date, 'YYYY') || '_h2';
        END IF;

        sub_partition_jp := partition_range_name || '_jp';
        sub_partition_us := partition_range_name || '_us';
        sub_partition_default := partition_range_name || '_default';

        IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = partition_range_name) THEN
            EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF orders FOR VALUES FROM (%L) TO (%L) PARTITION BY LIST (country);', 
                partition_range_name, start_date, end_date);
            EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES IN (%L);', sub_partition_jp, partition_range_name, 'JP');
            EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES IN (%L);', sub_partition_us, partition_range_name, 'US');
            EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %I DEFAULT;', sub_partition_default, partition_range_name);
            RAISE NOTICE 'Partition infrastructure successfully created: %', partition_range_name;
        ELSE
            RAISE NOTICE 'Partition % already exists, skipping creation.', partition_range_name;
        END IF;
    END LOOP;
END;
$$;

CALL prc_orders_auto_maintain_partitions();
  • To use the PROCEDURE, execute the CALL command as shown above.
  • For example, if the current date is 2026-07-01, this PROCEDURE will generate the corresponding partitions H2 2026 and H1 2027 for the countries US and JP.
  • You can adjust the PROCEDURE to fit your needs. This way, every 6 months you only need to call the PROCEDURE to pre-create the partitions or use a cron job to handle it automatically.

Happy coding!

See more articles here.

Comments

Popular posts from this blog

All Practice Series

Kubernetes Deployment for Zero Downtime

Deploying a NodeJS Server on Google Kubernetes Engine

Sitemap

React Practice Series

A Handy Guide to Using Dynamic Import in JavaScript

Helm for beginer - Deploy nginx to Google Kubernetes Engine

DevOps Practice Series

Docker Practice Series

Setting up Kubernetes Dashboard with Kind