Expression Index

Introduction

  • Expression Index is an extremely powerful feature of PostgreSQL, which can be used for any data type (such as text, numbers, time, arrays and more) as long as the result of the expression returns a Deterministic value
  • You can use Expression Index for all index types (B-Tree, Hash, GIN, GiST and more) as long as the expression you write in the CREATE INDEX command returns a data type that the Index type supports for processing
  • The nature of Expression Index is that instead of indexing on the original column value, Postgres precomputes the result of the function or expression as soon as you INSERT/UPDATE and saves the result directly into the Index file

Characteristics

  • After creating an Expression Index, when using a query, you must write it exactly like the expression in the Index, otherwise the index will not work
  • Every time a row of data is modified, Postgres must run that function or expression to get the result to create the Index. Therefore, avoid using functions that are too heavy, such as crypto encryption or complex hashing
  • Only allows creating Expression Indexes with functions that must be immutable, meaning that with the same input, it must always return a single unique output. You cannot use functions whose output can change on its own, such as now(), current_date or random()
  • When creating an index, it will store a physical file on the disk independent of the existing indexes, so consider carefully when creating to avoid wasting storage space for unused indexes

Detail

You should create the table and index as follows

CREATE TABLE test (
    email VARCHAR(255),
    username VARCHAR(100),
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    secret_key TEXT,
    created_at TIMESTAMPTZ,
    meta_value VARCHAR(50),
    title TEXT,
    content TEXT,
    metadata JSONB,
    price NUMERIC,
    discount_rate NUMERIC
);

-- 1
CREATE INDEX idx_test_email_lower ON test (LOWER(email)) INCLUDE (email);

-- 2
CREATE INDEX idx_test_username_trim ON test (TRIM(username));

-- 3
CREATE INDEX idx_test_fullname ON test ((first_name || ' ' || last_name));

-- 4
CREATE INDEX idx_test_created_date ON test (DATE_TRUNC('day', created_at AT TIME ZONE 'UTC'));

-- 5
CREATE INDEX idx_test_created_month ON test (EXTRACT(MONTH FROM created_at AT TIME ZONE 'UTC'));

-- 6
CREATE INDEX idx_test_discounted_price ON test ((price * (1 - discount_rate)));

-- 7
CREATE INDEX idx_test_token_hash ON test USING hash (md5(secret_key));

-- 8
CREATE INDEX idx_test_string_to_int ON test (((meta_value::INTEGER)));

-- 9
CREATE INDEX idx_test_fts ON test USING gin (to_tsvector('english', title || ' ' || content));

-- 10
CREATE INDEX idx_test_json_tags ON test USING gin ((metadata->'info'->'tags') jsonb_path_ops);

-- 11
CREATE INDEX idx_test_date_brin ON test USING brin (DATE_TRUNC('day', created_at AT TIME ZONE 'UTC'));
  • Index 1: Uses LOWER function, the INCLUDE part will store the raw email into the Index as well
  • Index 2: Uses TRIM function
  • Index 3: Concatenates strings to create Fullname
  • Index 4: Uses DATE_TRUNC function to get only the day part in TIMESTAMPTZ type (for example, the value 2026-06-20 15:30:45+00 will become 2026-06-20 00:00:00)
    • AT TIME ZONE 'UTC' means converting to UTC time (Coordinated Universal Time), equivalent to UTC+00:00
  • Index 5: Converts to UTC time and takes only the month value to create the Index
  • Index 6: Precalculates the discounted price using (price * (1 - discount_rate)) to create the Index
  • Index 7: Passes the values of the secret_key column into the md5 hash function and takes the result to create the Index
  • Index 8: Casts the value of the meta_value column to INTEGER to create the Index
  • Index 9: Creates a GIN Index with the content being the concatenation result of title and content, supporting Full-Text Search
  • Index 10: Creates a GIN Index from the value of the field 'info'->'tags' in JSONB. Because `tags` is an array, you can create a GIN Index, whereas if the data type is a primitive value, you can create a B-Tree Index.
  • Index 11: Creates a BRIN Index with the value range being only the day part, rather than taking the hour, minute, second and timezone parts

-- Query 1
SELECT email FROM test WHERE LOWER(email) = 'adrian.gutmann@gmail.com';

-- Query 2
SELECT * FROM test WHERE TRIM(username) = 'Kent78';

-- Query 3
SELECT * FROM test WHERE (first_name || ' ' || last_name) = 'Damon Kessler';

-- Query 4
SELECT * FROM test WHERE DATE_TRUNC('day', created_at AT TIME ZONE 'UTC') = '2026-06-19';

-- Query 5
SELECT * FROM test WHERE EXTRACT(MONTH FROM created_at AT TIME ZONE 'UTC') = 6;

-- Query 6
SELECT * FROM test WHERE (price * (1 - discount_rate)) < 200.00;

-- Query 7
SELECT * FROM test WHERE md5(secret_key) = '3b27b3d9f412f9bab5053ce12c01e9ef';

-- Query 8
SELECT * FROM test WHERE ((meta_value::INTEGER)) = 661807;

-- Query 9
SELECT * FROM test WHERE to_tsvector('english', title || ' ' || content) @@ to_tsquery('english', 'ubi & voro');

-- Query 10
SELECT * FROM test WHERE (metadata->'info'->'tags') @> '"postgres"'::jsonb;

-- Query 11
SELECT * FROM test WHERE DATE_TRUNC('day', created_at AT TIME ZONE 'UTC') BETWEEN '2026-06-01' AND '2026-06-19';
  • Query 1: Searches by email, case-insensitive
    • Normally when using an Expression Index, because the Index only stores the value after transformation, there will be no raw data of the column, so when executing any query, it is an Index Scan (including scan in Index and Heap scan)
    • Because when creating the Index for this column above, INCLUDE (email) was added, this Index already contains the raw value of the email
    • Therefore, when querying this field, you will see it is an Index Only Scan, which means no Heap scan is needed anymore
    • But other queries below only have an Index Scan
  • Query 2: Searches for username, automatically removing extra whitespaces
  • Query 3: Concatenates strings to search by full name
  • Query 4: When using 2026-06-19 to query, Postgres will automatically convert it to 2026-06-19 00:00:00 before comparing
    • The DATE_TRUNC('day', created_at AT TIME ZONE 'UTC') part will not have to convert the values in the created_at column anymore (since it was done when creating the Index), this is just the syntax to identify which Index needs to be used
  • Query 5: Filters records created in June
  • Query 6: Finds products with a post-discount price cheaper than 200.00
  • Query 7: Authenticates Token using MD5 hash string
    • The value passed in 3b27b3d9f412f9bab5053ce12c01e9ef is the result after hashing with MD5
  • Query 8: Searches after casting to Integer type
  • Query 9: Searches content where the concatenation result of title and content must simultaneously contain both words ubi and voro
  • Query 10: Finds JSONB records containing the element postgres located inside the tags array.
  • Query 11: Searches by date range


The result is as follows





Happy coding!

See more articles here.

Comments

Popular posts from this blog

All Practice Series

Kubernetes Deployment for Zero Downtime

Deploying a NodeJS Server on Google Kubernetes Engine

Sitemap

Setting up Kubernetes Dashboard with Kind

React Practice Series

Monitoring with cAdvisor, Prometheus and Grafana on Docker

DevOps Practice Series

Helm for beginer - Deploy nginx to Google Kubernetes Engine

Using Kafka with Docker and NodeJS