incubator-crunch-dev mailing list archives

From "Gabriel Reid (JIRA)" <>
Subject [jira] [Updated] (CRUNCH-3) Replicated ("map-side") joins
Date Sat, 07 Jul 2012 05:58:33 GMT


Gabriel Reid updated CRUNCH-3:

    Attachment: mapside-joins.patch

Patch implementing map-side joins, attached after the change was committed, for completeness only.
> Replicated ("map-side") joins
> -----------------------------
>                 Key: CRUNCH-3
>                 URL:
>             Project: Crunch
>          Issue Type: New Feature
>          Components: MapReduce Patterns
>            Reporter: Josh Wills
>            Assignee: Gabriel Reid
>         Attachments: mapside-joins.patch
> Replicated joins are a common way to improve performance when joining a large dataset
> with a small one. The smaller dataset is loaded into memory in the mapper/reducer tasks and
> is then joined with the larger dataset as the larger one is processed by the MapReduce job.
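
To illustrate the idea in the description above, here is a minimal plain-Java sketch of a replicated join. It does not use the Crunch API and does not reflect the contents of the attached patch. The class name ReplicatedJoinSketch, the join method, and the tab-separated output format are all assumptions made for illustration. The small dataset is indexed in an in-memory hash map once, and the large dataset is then streamed past it, which is roughly what each task would do.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Crunch or of mapside-joins.patch.
class ReplicatedJoinSketch {

    /**
     * Inner-joins two keyed datasets on their keys.
     * The small side must fit in memory; the large side is only iterated once.
     */
    static List<String> join(Iterable<Map.Entry<String, String>> small,
                             Iterable<Map.Entry<String, String>> large) {
        // Build phase: index the small dataset by key (a key may occur more than once).
        Map<String, List<String>> lookup = new HashMap<>();
        for (Map.Entry<String, String> row : small) {
            lookup.computeIfAbsent(row.getKey(), k -> new ArrayList<>()).add(row.getValue());
        }

        // Probe phase: stream the large dataset and emit one output row per match.
        List<String> joined = new ArrayList<>();
        for (Map.Entry<String, String> row : large) {
            List<String> matches = lookup.get(row.getKey());
            if (matches == null) {
                continue; // inner join: rows without a partner are dropped
            }
            for (String match : matches) {
                joined.add(row.getKey() + "\t" + row.getValue() + "\t" + match);
            }
        }
        return joined;
    }

    public static void main(String[] args) {
        // Made-up sample data: a small lookup table and a larger event stream.
        List<Map.Entry<String, String>> users =
                List.of(Map.entry("u1", "Alice"), Map.entry("u2", "Bob"));
        List<Map.Entry<String, String>> events =
                List.of(Map.entry("u1", "click"), Map.entry("u3", "view"), Map.entry("u2", "buy"));
        join(users, events).forEach(System.out::println);
    }
}
```

Because the lookup table is built entirely in memory, a sketch like this only makes sense when the smaller dataset comfortably fits in the heap of every task. In exchange, the large dataset does not have to be shuffled and sorted by key.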


