crunch-dev mailing list archives

From Joel Östlund <ostlund.j...@gmail.com>
Subject Adding the ability to choose my own number of reducers for Sharded Join
Date Thu, 10 Mar 2016 17:30:50 GMT
Hey,

I would like to add the ability to choose the number of reducers used with
the ShardedJoinStrategy. Currently only the default number (500) is used.
This causes a lot of problems in my pipelines and makes it much slower than
the DefaultJoinStrategy in cases where I need around 5000 reducers, because
of the large amount of data that has to go through 500 reducers. I opened a
pull request a while ago, but I am willing to follow your process: open a
JIRA ticket and then submit the change according to the official procedure,
if you think it is a good idea. I think this would improve performance in
Crunch when working with large data sets.

Basically, this is what I want to add:

https://github.com/apache/crunch/pull/8
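
To illustrate the motivation, here is a minimal standalone sketch (plain
Java, not Crunch code) of how the average load per reducer changes between
500 and 5000 reducers. The class name, the helper method and the input size
are made up for the example, and it assumes the records are spread evenly
across reducers, which real data may not be.

```java
public class ReducerLoad {

    // Hypothetical helper: average number of records each reducer handles,
    // rounded up, assuming records are distributed evenly.
    static long recordsPerReducer(long totalRecords, int numReducers) {
        if (numReducers <= 0) {
            throw new IllegalArgumentException("numReducers must be positive");
        }
        return (totalRecords + numReducers - 1) / numReducers;
    }

    public static void main(String[] args) {
        long totalRecords = 10_000_000_000L; // made-up input size

        // Fixed default of 500 reducers versus the 5000 I would like to use.
        System.out.println("500 reducers:  " + recordsPerReducer(totalRecords, 500));
        System.out.println("5000 reducers: " + recordsPerReducer(totalRecords, 5000));
    }
}
```

With ten times as many reducers, each one handles roughly a tenth of the
data, which is why being stuck at 500 hurts on large inputs.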
---
Thanks!

Joel Östlund
