Date: Fri, 7 Jun 2013 20:50:21 +0000 (UTC)
From: "Gabriel Reid (JIRA)"
To: crunch-dev@incubator.apache.org
Reply-To: dev@crunch.apache.org
Subject: [jira] [Created] (CRUNCH-213) Add sharded join functionality

Gabriel Reid created CRUNCH-213:
-----------------------------------

             Summary: Add sharded join functionality
                 Key: CRUNCH-213
                 URL: https://issues.apache.org/jira/browse/CRUNCH-213
             Project: Crunch
          Issue Type: New Feature
            Reporter: Gabriel Reid
            Assignee: Gabriel Reid

Performing joins where a large proportion of the values on one or both sides of the join are mapped to a single key can result in poor performance, as one reducer (or a small number of reducers) ends up handling most of the joining work, leaving the rest of the cluster idle.
Sharded joining should be added to allow splitting up join keys, thereby distributing values mapped to a single key over multiple reducer partitions.
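
To illustrate the idea of splitting up join keys, here is a minimal in-memory sketch in plain Python. It is not the Crunch implementation and uses no Crunch API; the names `sharded_join`, `shard_left`, and `NUM_SHARDS` are hypothetical. It assumes one common approach: each record on the skewed (left) side is assigned to one randomly chosen shard of its key, and each record on the other (right) side is replicated to every shard, so that each (key, shard) group can be joined independently.

```python
import random
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_left(left, num_shards=NUM_SHARDS, seed=0):
    """Group left-side (key, value) pairs under (key, shard) composite keys.

    Each record goes to exactly one randomly chosen shard, so the values of
    a heavily used key are spread over up to num_shards groups.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for key, value in left:
        groups[(key, rng.randrange(num_shards))].append(value)
    return groups


def sharded_join(left, right, num_shards=NUM_SHARDS, seed=0):
    """Inner-join two lists of (key, value) pairs using sharded keys."""
    left_groups = shard_left(left, num_shards, seed)

    # Replicate every right-side record to all shards of its key, so each
    # left-side record finds all of its matches whichever shard it landed in.
    right_groups = defaultdict(list)
    for key, value in right:
        for shard in range(num_shards):
            right_groups[(key, shard)].append(value)

    # Each (key, shard) group is an independent unit of join work; in a
    # cluster, these groups could be handled by different reducer partitions.
    joined = []
    for (key, shard), left_values in left_groups.items():
        for left_value in left_values:
            for right_value in right_groups.get((key, shard), []):
                joined.append((key, (left_value, right_value)))
    return joined
```

The trade-off in this sketch is that the right side is copied `num_shards` times, so it suits cases where the skew is mostly on one side of the join. The result contains the same pairs as an unsharded inner join, only produced from several smaller groups instead of one large one.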