Posts

Showing posts with the label query optimization

Using EXISTS and NOT EXISTS

Image
Introduction The EXISTS clause is used to solve a single problem in data logic which is checking the existence of data based on a condition, without worrying about how many times it exists or what the specific value is. How it works Just like when using IN with EXISTS , Postgres will also automatically use Semi-Join algorithms (or Anti-Semi-Join for NOT EXISTS ) Semi Join (EXISTS) : Returns rows from the main table (left table) if there is at least one matching row in the sub-table. The optimizer stops searching as soon as it finds the first matching row for each record of the main table. Anti Join (NOT EXISTS) : Contrary to Semi Join , it returns rows from the main table if no matching rows are found in the sub-table. It still uses popular algorithms such as Nested Loop Semi Join, Hash Semi Join, Merge Semi Join with a Short-circuit mechanism, meaning that for each row value, it only needs to iterate until the first value is found, rather than finding all matching values like wh...

Join with Multiple Column Comparisons

Image
Introduction When performing a table join that compares multiple columns simultaneously (such as ON A.col1 = B.col1 AND A.col2 = B.col2), PostgreSQL will still use familiar algorithms like Hash Join, Merge Join or Nested Loop Join. However, the way these algorithms process multiple columns involves very distinctive strategies as follows. Hash Join This is usually the number one choice for Postgres when joining multiple columns on large tables. Instead of hashing each individual column, Postgres performs string concatenation to hash the combination of all those columns at the same time. Build Phase: Postgres takes the values of all participating join columns and groups them together (like combining value = col1 + col2 + col3), then inputs this entire combined string into the Hash Function to calculate the Hash Number to be placed into the Bucket. Probe Phase: Next, it also combines the corresponding columns of the remaining table to hash similarly and then compares it with the Hash Tabl...