crunch-dev mailing list archives

From Josh Wills <josh.wi...@gmail.com>
Subject Re: Adding the ability to choose my own number of reducers for Sharded Join
Date Thu, 10 Mar 2016 18:33:00 GMT
Hey Joel,

I'm sorry I missed this before; we rarely get pull requests for historical
reasons (i.e., we're all really old). The patch looks good; I'll do the
merge now.

J

On Thu, Mar 10, 2016 at 9:30 AM, Joel Östlund <ostlund.joel@gmail.com>
wrote:

> Hey,
>
> I would like to add the ability to choose the number of reducers used
> with the ShardedJoinStrategy. Currently only the default number (500) is
> used. This causes a lot of problems in my pipelines and makes it much
> slower than the DefaultJoinStrategy in cases where I need around 5000
> reducers, because of the large amount of data that has to go through 500
> reducers. I opened a pull request a while ago, but I am willing to follow
> your process by opening a JIRA ticket and submitting the patch through the
> official channels, if you guys think it is a good idea. I think this would
> improve performance in Crunch when working with large data sets.
>
> Basically this is what I want to add:
>
> https://github.com/apache/crunch/pull/8
> ---
> Thanks!
>
> Joel Östlund
>
