Posts

Showing posts with the label ram

Hash Join

Image
Introduction In this article, we will learn about Hash Join, which is the preferred algorithm when joining two large tables without indexes, or when the data volume exceeds the optimization capability of a Nested Loop. How It Works Step 1: Build Phase First, PostgreSQL selects the smaller table to load into memory (RAM). Next, it takes each item and passes it into a Hash Function to receive a hash number. Postgres relies on this value to sort the results into the corresponding bucket, resulting in the creation of a Hash Table stored in RAM. Step 2: Probe Phase Now, PostgreSQL scans the larger table, also taking each item to pass into the Hash Function just like in step 1. Once the hash number is obtained, it only needs to access the corresponding bucket in the Hash Table to check. Hash Collision When using a Hash Function, there are still cases where two different values produce the same result, which is called a Hash Collision. In this case, values with the same Hash Number will resid...