Hash join

How: First, it builds a hash table from the smaller table, where the key is the JOIN key and the value is the tuple data the query needs. Then it scans the other table, hashes the JOIN key of each row, and emits every row whose key matches an entry in the hash table.
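The build and probe phases described above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how any particular database implements it; the function name `hash_join` and the sample `customers` and `orders` rows are made up for the example.

```python
from collections import defaultdict


def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner equi-join of two lists of dicts (illustrative sketch)."""
    # Build phase: one pass over the smaller input.
    # Each JOIN key maps to the list of rows that carry it.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)

    # Probe phase: one pass over the other input.
    # Look up each row's JOIN key and emit one joined row per match.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined


# Hypothetical sample data: the smaller table is used as the build side.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Linus"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 3},
]

result = hash_join(customers, orders, "customer_id", "customer_id")
for row in result:
    print(row)
```

Order 12 has no matching customer, so it is dropped, and customer 2 has no orders, so it never appears; that is the behaviour of an inner join.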

When: It's used for joining large datasets on equality conditions, when the hash table built from the smaller table can fit into working memory (work_mem in PostgreSQL).

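Tip: Ensure the join has enough working memory to keep the hash table from spilling to disk. A join that spills is typically processed in batches through temporary files, which is slower.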
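To see why spilling costs extra work, here is a rough sketch of the batching idea: both inputs are split by the same hash of the JOIN key, and each pair of batches is joined on its own, so only one batch of the build side has to sit in memory at a time. It is a simplified model under stated assumptions. The batches are plain Python lists standing in for temporary files, and `partitioned_hash_join`, `num_batches`, and the sample rows are hypothetical names chosen for the example.

```python
from collections import defaultdict


def partitioned_hash_join(build_rows, probe_rows, key, num_batches):
    """Inner equi-join processed batch by batch (illustrative sketch)."""
    # Partition phase: rows with equal keys always land in the same batch,
    # because both inputs use the same hash function and batch count.
    build_parts = [[] for _ in range(num_batches)]
    probe_parts = [[] for _ in range(num_batches)]
    for row in build_rows:
        build_parts[hash(row[key]) % num_batches].append(row)
    for row in probe_rows:
        probe_parts[hash(row[key]) % num_batches].append(row)

    # Join phase: an ordinary build and probe per batch.
    # Only the hash table of the current batch exists at any moment.
    joined = []
    for build_part, probe_part in zip(build_parts, probe_parts):
        table = defaultdict(list)
        for row in build_part:
            table[row[key]].append(row)
        for row in probe_part:
            for match in table.get(row[key], []):
                joined.append({**match, **row})
    return joined


# Hypothetical sample data.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Linus"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 1},
    {"order_id": 12, "customer_id": 3},
]

batched = partitioned_hash_join(customers, orders, "customer_id", num_batches=4)
```

The result contains the same rows as a single-pass hash join, but every row is written to and read back from a batch first. With real temporary files that extra pass is disk I/O, which is the cost the tip is trying to avoid.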

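Complexity (M = rows in the smaller table, N = rows in the other table):

  • O(M+N) expected time, O(M) space
  • O(M) to build the hash table + O(N) to scan the other table, assuming hash lookups take constant time on average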

IDEAS CURATED BY

ocpodariu

Alt account of @ocp. I use it to stash ideas about software engineering

Tips to improve the performance of your SQL queries
