Posts

Showing posts with the label multiple columns

Join with Multiple Column Comparisons

Image
Introduction When performing a table join that compares multiple columns simultaneously (such as ON A.col1 = B.col1 AND A.col2 = B.col2), PostgreSQL will still use familiar algorithms like Hash Join, Merge Join or Nested Loop Join. However, the way these algorithms process multiple columns involves very distinctive strategies as follows. Hash Join This is usually the number one choice for Postgres when joining multiple columns on large tables. Instead of hashing each individual column, Postgres performs string concatenation to hash the combination of all those columns at the same time. Build Phase: Postgres takes the values of all participating join columns and groups them together (like combining value = col1 + col2 + col3), then inputs this entire combined string into the Hash Function to calculate the Hash Number to be placed into the Bucket. Probe Phase: Next, it also combines the corresponding columns of the remaining table to hash similarly and then compares it with the Hash Tabl...