Using Table Partitioning Effectively
Introduction
In the previous article, we explored how to use table partitioning. However, you might have realized that we need to create partitions manually. This approach has limitations, including the difficulty of centralized management and the risk of missing partitions if data volume grows excessively large in the future (requiring additional partitions). Therefore, I will guide you on how to create and manage partitions effectively.
Hash Partitioning
- Hash Partitioning operates by evenly distributing data into a fixed number of partitions right from the start using the
MODULUSalgorithm, which includes the following information:MODULUS (M): The total fixed number of partitions into which you want to distribute the data.REMAINDER (R): The identifying remainder for that specific partition (ranging from 0 to M - 1).
- The
MODULUSalgorithm works exactly as its name suggests by calculating the remainder. When inserting a record into the parent table:- Postgres will use a hash function to generate a
Hash Numberbased on the value of the partition column - Then it divides the
Hash Numberby theMODULUSto find the remainder - The resulting remainder determines the actual partition where the data is inserted (
REMAINDER)
- Postgres will use a hash function to generate a
- Consequently, you can see that the total number of partitions created initially is fixed and cannot be changed arbitrarily.
- This limitation occurs because of the remainder operation. The same value yields the same
Hash Numberbut if theMODULUSchanges (such as increasing from 3 to 4, wherehash % 3 != hash % 4), Postgres will fail to locate the correctREMAINDERto retrieve the value - Therefore, the solution is to select a sufficiently large number of partitions right from the beginning to handle future loads, such as 8, 16 or 32 partitions
- Alternatively, you must create a new table to re-partition and migrate the data over
- This limitation occurs because of the remainder operation. The same value yields the same
Other Types of Partitioning
On the other hand, when using Range Partitioning and List Partitioning, even in combination with Sub-partitioning, you can use a PROCEDURE in Postgres to pre-create the necessary partitions required for the future.
Detail
For Hash Partitioning, use a DO BLOCK as follows to pre-create the partitions. You can change the num_partitions value to the corresponding number of partitions you want to create initially.
DO $$
DECLARE
i INT;
num_partitions INT := 8;
BEGIN
FOR i IN 0..(num_partitions - 1) LOOP
EXECUTE format('CREATE TABLE IF NOT EXISTS users_p%s PARTITION OF users FOR VALUES WITH (MODULUS %s, REMAINDER %s);', i, num_partitions, i);
END LOOP;
END $$;
Next is the PROCEDURE used to check the current time and generate corresponding partitions for both the present and the future.
CREATE OR REPLACE PROCEDURE prc_orders_auto_maintain_partitions()
LANGUAGE plpgsql AS $$
DECLARE
current_check_date DATE;
start_date DATE;
end_date DATE;
partition_range_name TEXT;
sub_partition_jp TEXT;
sub_partition_us TEXT;
sub_partition_default TEXT;
i INT;
BEGIN
FOR i IN 0..1 LOOP
current_check_date := CURRENT_DATE + (i * INTERVAL '6 months');
IF EXTRACT(MONTH FROM current_check_date) <= 6 THEN
start_date := DATE_TRUNC('year', current_check_date)::date;
end_date := (start_date + INTERVAL '6 months')::date;
partition_range_name := 'orders_' || TO_CHAR(start_date, 'YYYY') || '_h1';
ELSE
start_date := (DATE_TRUNC('year', current_check_date) + INTERVAL '6 months')::date;
end_date := (DATE_TRUNC('year', current_check_date) + INTERVAL '1 year')::date;
partition_range_name := 'orders_' || TO_CHAR(start_date, 'YYYY') || '_h2';
END IF;
sub_partition_jp := partition_range_name || '_jp';
sub_partition_us := partition_range_name || '_us';
sub_partition_default := partition_range_name || '_default';
IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = partition_range_name) THEN
EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF orders FOR VALUES FROM (%L) TO (%L) PARTITION BY LIST (country);',
partition_range_name, start_date, end_date);
EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES IN (%L);', sub_partition_jp, partition_range_name, 'JP');
EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %I FOR VALUES IN (%L);', sub_partition_us, partition_range_name, 'US');
EXECUTE format('CREATE TABLE IF NOT EXISTS %I PARTITION OF %I DEFAULT;', sub_partition_default, partition_range_name);
RAISE NOTICE 'Partition infrastructure successfully created: %', partition_range_name;
ELSE
RAISE NOTICE 'Partition % already exists, skipping creation.', partition_range_name;
END IF;
END LOOP;
END;
$$;
CALL prc_orders_auto_maintain_partitions();
- To use the
PROCEDURE, execute theCALLcommand as shown above. - For example, if the current date is 2026-07-01, this
PROCEDUREwill generate the corresponding partitionsH2 2026andH1 2027for the countriesUSandJP. - You can adjust the PROCEDURE to fit your needs. This way, every 6 months you only need to call the PROCEDURE to pre-create the partitions or use a cron job to handle it automatically.
Happy coding!
Comments
Post a Comment