Standard View and Materialized View

Introduction

View

  • Also known as Standard View or Virtual Table, this is essentially just an SQL query statement saved under a specific name.
  • It does not store any data on the disk, it only stores the definition (SQL code block) within the system.
  • Every time you execute SELECT * FROM my_view, Postgres runs the underlying SQL definition of that View directly against the base tables.
  • The data is always up to date, if the source tables change, the View immediately reflects those modifications.
  • Performance depends entirely on the original SQL query, if the original query is complex involving multiple table JOIN operations or AGGREGATE calculations, the View will run slowly because Postgres must recompute everything from scratch upon every invocation.

Materialized View

  • This is a View but it physically stores data on the disk just like a regular table.
  • It saves the entire result of the query onto the disk at the moment it is created or refreshed.
  • When you execute SELECT * FROM my_mat_view, Postgres reads the pre-saved data from the disk like a normal table, without needing to re-run complex SQL queries or reconnect the base tables.
  • It does not update automatically, data is frozen at the time of creation or refreshment. If the base tables change, the Materialized View becomes stale until you actively execute a command to refresh it.
  • Query speed is exceptionally fast for complex computations, reports or data aggregation, because the data is pre-calculated, it essentially operates as a caching mechanism at the database layer.
  • You can create indexes on a Materialized View to accelerate search speeds, which is something a standard View cannot achieve.

Refresh Methods

Postgres provides the REFRESH MATERIALIZED VIEW command with two main operational mechanisms:

  • Refresh with Lock
    • Postgres will delete all old data in the View and re-run the original SQL query to load fresh data.
    • This command applies an Exclusive Lock on the Materialized View, during the refresh process which can take from several seconds to a few minutes depending on the volume of data, you cannot execute SELECT statements on this View, user queries will hang until the refresh completes.
  • Concurrent Refresh
    • Postgres compares the old data with the new data, then only applies the changes (INSERT, UPDATE, DELETE) to the View.
    • Its advantage is that it does not require locking the table, users can still execute SELECT statements to read data normally while the system is performing the refresh.
    • The mandatory condition to use CONCURRENTLY is that your Materialized View must have at least one UNIQUE INDEX on one column or a group of columns.

Refresh Strategies

Since Postgres does not automatically refresh when the base tables change, you need to set up the execution timing based on your project requirements:

  • Scheduled / Cron Job
    • This is the most popular approach for reporting systems like dashboards and analytics, where data does not need to be real-time every second and a certain delay is acceptable.
    • Use external scheduling tools such as Linux Cron, Airflow or the pg_cron extension running directly inside Postgres to automatically invoke the refresh command.
    • Recommended frequencies:
      • Every night at 2:00 AM for financial reports or daily summaries.
      • Every 15 to 30 minutes for dashboard charts tracking progress.
  • Event-Driven / Trigger
    • If you want the Materialized View to update immediately or almost immediately when the base tables change, you can use a TRIGGER.
    • Write a trigger function attached to the base table, every time the base table encounters an INSERT, UPDATE, DELETE action, this trigger will automatically invoke the REFRESH command.
    • This method should only be used when the write frequency to the base table is minimal, if the base table is written to continuously hundreds of times per second, the trigger will force Postgres to refresh constantly, leading to a bottleneck and severely impacting system performance.
  • On-Demand
    • Provide a Refresh button on the Admin or Dashboard user interface, when the user clicks it, the application sends a REFRESH command to the database.
    • Suitable for heavy reports where users decide for themselves when they need the latest figures for analysis.

Use cases

  • Choose Standard View when: You need security to restrict access permissions, cleaner code structure, and the data in the base tables changes continuously but you must retrieve accurate real-time data every second.
  • Choose Materialized View when: You need to build complex statistical reports such as total monthly revenue or daily user statistics, this data does not need to be accurate to the exact second, you can utilize an additional cronjob to REFRESH it every hour or every night.

Detail

First, create the table as follows:

CREATE TABLE categories (
    category_id SERIAL PRIMARY KEY,
    category_name VARCHAR(100) NOT NULL
);

CREATE TABLE products (
    product_id SERIAL PRIMARY KEY,
    product_name VARCHAR(255) NOT NULL,
    price NUMERIC(12, 2) NOT NULL,
    category_id INT REFERENCES categories(category_id)
);

CREATE TABLE orders (
    order_id SERIAL PRIMARY KEY,
    product_id INT REFERENCES products(product_id),
    quantity INT NOT NULL,
    total_amount NUMERIC(12, 2) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

Then, you create and use a Standard View like this:

CREATE OR REPLACE VIEW view_realtime_category_sales AS
SELECT 
    c.category_id,
    c.category_name,
    COUNT(o.order_id) AS total_orders,
    SUM(o.quantity) AS total_products_sold,
    SUM(o.total_amount) AS total_revenue
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN orders o ON p.product_id = o.product_id
GROUP BY c.category_id, c.category_name;

SELECT * FROM view_realtime_category_sales;

When using it, just execute a standard SELECT statement concisely to get the latest data from that View.



And create a Materialized View like this:

CREATE MATERIALIZED VIEW mv_category_sales_analytics AS
SELECT 
    c.category_id,
    c.category_name,
    COUNT(o.order_id) AS total_orders,
    SUM(o.quantity) AS total_products_sold,
    SUM(o.total_amount) AS total_revenue
FROM categories c
JOIN products p ON c.category_id = p.category_id
JOIN orders o ON p.product_id = o.product_id
GROUP BY c.category_id, c.category_name;

CREATE UNIQUE INDEX idx_mv_cat_sales_id ON mv_category_sales_analytics (category_id);

SELECT * FROM mv_category_sales_analytics;

REFRESH MATERIALIZED VIEW mv_category_sales_analytics;

REFRESH MATERIALIZED VIEW CONCURRENTLY mv_category_sales_analytics;
  • You need to create a UNIQUE INDEX to be able to refresh with CONCURRENTLY.
  • When executing a select query, it will be extremely fast because all data is already saved, but if the source data changes, this View does not change automatically.
  • Manual refresh is required when you want to update the View.


Happy coding!

See more articles here.

Comments

Popular posts from this blog

All Practice Series

Kubernetes Deployment for Zero Downtime

Deploying a NodeJS Server on Google Kubernetes Engine

Sitemap

React Practice Series

A Handy Guide to Using Dynamic Import in JavaScript

Helm for beginer - Deploy nginx to Google Kubernetes Engine

DevOps Practice Series

Docker Practice Series

Setting up Kubernetes Dashboard with Kind