Expression Index
Introduction
- Expression Index is an extremely powerful feature of PostgreSQL, which can be used for any data type (such as text, numbers, time, arrays and more) as long as the result of the expression returns a Deterministic value
- You can use Expression Index for all index types (
B-Tree,Hash,GIN,GiSTand more) as long as the expression you write in theCREATE INDEXcommand returns a data type that the Index type supports for processing - The nature of Expression Index is that instead of indexing on the original column value, Postgres precomputes the result of the function or expression as soon as you
INSERT/UPDATEand saves the result directly into the Index file
Characteristics
- After creating an Expression Index, when using a query, you must write it exactly like the expression in the Index, otherwise the index will not work
- Every time a row of data is modified, Postgres must run that function or expression to get the result to create the Index. Therefore, avoid using functions that are too heavy, such as crypto encryption or complex hashing
- Only allows creating Expression Indexes with functions that must be immutable, meaning that with the same input, it must always return a single unique output. You cannot use functions whose output can change on its own, such as
now(),current_dateorrandom() - When creating an index, it will store a physical file on the disk independent of the existing indexes, so consider carefully when creating to avoid wasting storage space for unused indexes
Detail
You should create the table and index as follows
CREATE TABLE test (
email VARCHAR(255),
username VARCHAR(100),
first_name VARCHAR(50),
last_name VARCHAR(50),
secret_key TEXT,
created_at TIMESTAMPTZ,
meta_value VARCHAR(50),
title TEXT,
content TEXT,
metadata JSONB,
price NUMERIC,
discount_rate NUMERIC
);
-- 1
CREATE INDEX idx_test_email_lower ON test (LOWER(email)) INCLUDE (email);
-- 2
CREATE INDEX idx_test_username_trim ON test (TRIM(username));
-- 3
CREATE INDEX idx_test_fullname ON test ((first_name || ' ' || last_name));
-- 4
CREATE INDEX idx_test_created_date ON test (DATE_TRUNC('day', created_at AT TIME ZONE 'UTC'));
-- 5
CREATE INDEX idx_test_created_month ON test (EXTRACT(MONTH FROM created_at AT TIME ZONE 'UTC'));
-- 6
CREATE INDEX idx_test_discounted_price ON test ((price * (1 - discount_rate)));
-- 7
CREATE INDEX idx_test_token_hash ON test USING hash (md5(secret_key));
-- 8
CREATE INDEX idx_test_string_to_int ON test (((meta_value::INTEGER)));
-- 9
CREATE INDEX idx_test_fts ON test USING gin (to_tsvector('english', title || ' ' || content));
-- 10
CREATE INDEX idx_test_json_tags ON test USING gin ((metadata->'info'->'tags') jsonb_path_ops);
-- 11
CREATE INDEX idx_test_date_brin ON test USING brin (DATE_TRUNC('day', created_at AT TIME ZONE 'UTC'));
- Index 1: Uses LOWER function, the INCLUDE part will store the raw email into the Index as well
- Index 2: Uses TRIM function
- Index 3: Concatenates strings to create Fullname
- Index 4: Uses DATE_TRUNC function to get only the
daypart inTIMESTAMPTZtype (for example, the value2026-06-20 15:30:45+00will become2026-06-20 00:00:00)AT TIME ZONE 'UTC'means converting to UTC time (Coordinated Universal Time), equivalent toUTC+00:00
- Index 5: Converts to UTC time and takes only the month value to create the Index
- Index 6: Precalculates the discounted price using
(price * (1 - discount_rate))to create the Index - Index 7: Passes the values of the
secret_keycolumn into themd5hash function and takes the result to create the Index - Index 8: Casts the value of the
meta_valuecolumn toINTEGERto create the Index - Index 9: Creates a GIN Index with the content being the concatenation result of
titleandcontent, supporting Full-Text Search - Index 10: Creates a GIN Index from the value of the field
'info'->'tags'in JSONB. Because `tags` is an array, you can create a GIN Index, whereas if the data type is a primitive value, you can create a B-Tree Index. - Index 11: Creates a BRIN Index with the value range being only the
daypart, rather than taking the hour, minute, second and timezone parts
-- Query 1
SELECT email FROM test WHERE LOWER(email) = 'adrian.gutmann@gmail.com';
-- Query 2
SELECT * FROM test WHERE TRIM(username) = 'Kent78';
-- Query 3
SELECT * FROM test WHERE (first_name || ' ' || last_name) = 'Damon Kessler';
-- Query 4
SELECT * FROM test WHERE DATE_TRUNC('day', created_at AT TIME ZONE 'UTC') = '2026-06-19';
-- Query 5
SELECT * FROM test WHERE EXTRACT(MONTH FROM created_at AT TIME ZONE 'UTC') = 6;
-- Query 6
SELECT * FROM test WHERE (price * (1 - discount_rate)) < 200.00;
-- Query 7
SELECT * FROM test WHERE md5(secret_key) = '3b27b3d9f412f9bab5053ce12c01e9ef';
-- Query 8
SELECT * FROM test WHERE ((meta_value::INTEGER)) = 661807;
-- Query 9
SELECT * FROM test WHERE to_tsvector('english', title || ' ' || content) @@ to_tsquery('english', 'ubi & voro');
-- Query 10
SELECT * FROM test WHERE (metadata->'info'->'tags') @> '"postgres"'::jsonb;
-- Query 11
SELECT * FROM test WHERE DATE_TRUNC('day', created_at AT TIME ZONE 'UTC') BETWEEN '2026-06-01' AND '2026-06-19';
- Query 1: Searches by
email, case-insensitive- Normally when using an Expression Index, because the Index only stores the value after transformation, there will be no raw data of the column, so when executing any query, it is an
Index Scan(including scan in Index and Heap scan) - Because when creating the Index for this column above,
INCLUDE (email)was added, this Index already contains the raw value of the email - Therefore, when querying this field, you will see it is an
Index Only Scan, which means no Heap scan is needed anymore - But other queries below only have an
Index Scan
- Normally when using an Expression Index, because the Index only stores the value after transformation, there will be no raw data of the column, so when executing any query, it is an
- Query 2: Searches for
username, automatically removing extra whitespaces - Query 3: Concatenates strings to search by full name
- Query 4: When using
2026-06-19to query, Postgres will automatically convert it to2026-06-19 00:00:00before comparing- The
DATE_TRUNC('day', created_at AT TIME ZONE 'UTC')part will not have to convert the values in thecreated_atcolumn anymore (since it was done when creating the Index), this is just the syntax to identify which Index needs to be used
- The
- Query 5: Filters records created in June
- Query 6: Finds products with a post-discount price cheaper than 200.00
- Query 7: Authenticates Token using MD5 hash string
- The value passed in
3b27b3d9f412f9bab5053ce12c01e9efis the result after hashing with MD5
- The value passed in
- Query 8: Searches after casting to Integer type
- Query 9: Searches content where the concatenation result of
titleandcontentmust simultaneously contain both wordsubiandvoro - Query 10: Finds JSONB records containing the element
postgreslocated inside the tags array. - Query 11: Searches by date range
The result is as follows
Happy coding!
Comments
Post a Comment