From: "Josh Wills (JIRA)"
To: crunch-dev@incubator.apache.org
Reply-To: dev@crunch.apache.org
Date: Thu, 24 Mar 2016 17:37:25 +0000 (UTC)
Subject: [jira] [Commented] (CRUNCH-598) scaleFactor for JoinStrategy

    [ https://issues.apache.org/jira/browse/CRUNCH-598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15210610#comment-15210610 ]

Josh Wills commented on CRUNCH-598:
-----------------------------------

[~desmit] what do you imagine as a fix here? A constructor argument for DefaultJoinStrategy (and possibly ShardedJoinStrategy?)

> scaleFactor for JoinStrategy
> ----------------------------
>
>                 Key: CRUNCH-598
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-598
>             Project: Crunch
>          Issue Type: Improvement
>            Reporter: Stefan De Smit
>            Priority: Minor
>
> The scaleFactor method has a big influence on the planner.
> For joins, there currently isn't a clean way to set it, even though it is often required, since a join can have a large multiplication factor.
> For the DefaultJoinStrategy, it's possible to pass a custom JoinFn with a proper scaleFactor, or just extend the default InnerJoinFn with a scaleFactor.
> For the ShardedJoinStrategy, this isn't possible, even though it is often needed more (as ShardedJoin is especially handy for one-to-very-many joins).
> For the default ConstantShardingStrategy, it might make sense to also use numShards as the scale factor for the left side, as that's roughly what happens: every left entry is emitted numShards times.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
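
The reasoning in the issue's last point (every left entry is emitted numShards times, so the left side grows by a factor of numShards) can be illustrated with a small standalone sketch. This is not Crunch code: the class `ShardedJoinScaleSketch` and the methods `replicateLeft` and `leftScaleFactor` are hypothetical names invented for illustration, and the sketch only models the replication arithmetic, not how ConstantShardingStrategy or the planner are actually implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, standalone model of the replication described in the issue.
// It does not use or mirror the real Crunch classes.
public class ShardedJoinScaleSketch {

    // Emits each left-side entry once per shard, tagged with the shard index.
    public static List<String> replicateLeft(List<String> leftEntries, int numShards) {
        List<String> out = new ArrayList<>();
        for (String entry : leftEntries) {
            for (int shard = 0; shard < numShards; shard++) {
                out.add(entry + "#" + shard);
            }
        }
        return out;
    }

    // Assumed size hint for the left side: output size / input size = numShards.
    public static float leftScaleFactor(int numShards) {
        return (float) numShards;
    }

    public static void main(String[] args) {
        List<String> left = Arrays.asList("a", "b", "c");
        int numShards = 4;
        List<String> replicated = replicateLeft(left, numShards);
        // Compare the observed growth with the proposed hint.
        float observed = (float) replicated.size() / left.size();
        System.out.println("observed growth: " + observed
            + ", proposed scale factor: " + leftScaleFactor(numShards));
    }
}
```

Under this model the observed growth of the left side matches the proposed hint exactly, which is the argument for using numShards as the default. Whether the right side or the join output needs its own factor is left open, as in the issue.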