Merge join - Deepstash

Merge join

How: Sorts both tables on the join key, then merges them like a zipper: it walks through both sorted inputs in step and emits the rows whose keys match. When an input does not fit in memory, it is sorted externally: the data is split into chunks, each chunk is sorted in memory and written to a temporary file on disk, and the sorted files are then combined with a multi-way merge.
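The zipper-style merge can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular database implements it: the function name `merge_join`, the `key` parameter, and the sample `users` and `orders` rows are all made up for the example, and both inputs are assumed to be lists already sorted by the join key.

```python
def merge_join(left, right, key=lambda row: row[0]):
    """Inner-join two lists that are ALREADY sorted by `key` (assumption)."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = key(left[i]), key(right[j])
        if lk < rk:
            i += 1          # left key is behind: advance the left side
        elif lk > rk:
            j += 1          # right key is behind: advance the right side
        else:
            # Keys match: find the run of rows sharing this key on each side.
            i_end = i
            while i_end < len(left) and key(left[i_end]) == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and key(right[j_end]) == lk:
                j_end += 1
            # Emit every pairing of the two runs, then skip past both runs.
            for l_row in left[i:i_end]:
                for r_row in right[j:j_end]:
                    out.append((l_row, r_row))
            i, j = i_end, j_end
    return out


# Hypothetical sample data, sorted by the first column (the join key).
users = [(1, "ann"), (2, "bob"), (4, "dee")]
orders = [(1, "book"), (1, "pen"), (3, "cup"), (4, "mug")]
result = merge_join(users, orders)
```

Here `result` pairs user 1 with both of their orders and user 4 with one order; users and orders without a matching key on the other side are skipped, as in an inner join.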

When: It's used when at least one input is already sorted on the join key, or when the datasets are larger than the working memory.
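For the larger-than-memory case, the chunk, sort, spill, and merge steps might look roughly like the sketch below. It is only an illustration under simplifying assumptions: the values are integers stored one per line, `chunk_size` stands in for the memory budget, and the names `external_sort` and `_read_run` are hypothetical. It relies on the standard library's `tempfile.TemporaryFile` and `heapq.merge`.

```python
import heapq
import tempfile
from itertools import islice


def _read_run(f):
    """Lazily read one sorted run back from its temporary file."""
    for line in f:
        yield int(line)


def external_sort(values, chunk_size):
    """Yield integers from `values` in sorted order, sorting at most
    `chunk_size` of them in memory at a time."""
    it = iter(values)
    chunk_files = []
    try:
        while True:
            # Take the next chunk and sort it in memory.
            chunk = sorted(islice(it, chunk_size))
            if not chunk:
                break
            # Spill the sorted chunk to a temporary file on disk.
            f = tempfile.TemporaryFile(mode="w+")
            f.writelines(f"{v}\n" for v in chunk)
            f.seek(0)
            chunk_files.append(f)
        # Multi-way merge: heapq.merge consumes the sorted runs lazily.
        yield from heapq.merge(*(_read_run(f) for f in chunk_files))
    finally:
        for f in chunk_files:
            f.close()


sorted_values = list(external_sort([5, 3, 9, 1, 7, 2, 8], chunk_size=3))
```

Once each input has been sorted this way, the two sorted streams can be fed to the merge step of the join.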

Tip: Make sure the datasets are already sorted on the join key, or can be sorted cheaply (for example, through an index on the join columns).

Complexity (M and N are the row counts of the two inputs):

  • for pre-sorted inputs: O(M + N) time, O(1) extra space
  • otherwise: O(M log M + N log N) time, O(M + N) space
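The pre-sorted bounds can be seen in a streaming version of the merge. The sketch below assumes that both inputs are iterables of `(key, value)` pairs sorted by key and that keys are unique within each input (duplicate keys would need the matching rows to be buffered, which this version does not do). The name `streaming_merge_join` and the sample rows are hypothetical. Every loop iteration advances at least one input, so the loop runs at most M + N times, and only one row per input is held at any moment.

```python
def streaming_merge_join(left, right):
    """Inner-join two key-sorted iterables of (key, value) pairs.

    Assumes unique keys within each input and no None rows."""
    left_it, right_it = iter(left), iter(right)
    l_row = next(left_it, None)
    r_row = next(right_it, None)
    while l_row is not None and r_row is not None:
        if l_row[0] < r_row[0]:
            l_row = next(left_it, None)    # advance the left input only
        elif l_row[0] > r_row[0]:
            r_row = next(right_it, None)   # advance the right input only
        else:
            yield l_row[0], l_row[1], r_row[1]
            l_row = next(left_it, None)    # match: advance both inputs
            r_row = next(right_it, None)


# Hypothetical sample inputs, each sorted by key with unique keys.
left_rows = [(1, "a"), (3, "c"), (5, "e")]
right_rows = [(2, "x"), (3, "y"), (5, "z"), (6, "w")]
joined = list(streaming_merge_join(left_rows, right_rows))
```

When the inputs are not pre-sorted, the sorting step dominates, which is where the O(M log M + N log N) time and O(M + N) space in the second bullet come from.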

CURATED FROM

IDEAS CURATED BY

ocpodariu

Alt account of @ocp. I use it to stash ideas about software engineering

Tips to improve the performance of your SQL queries
