crunch-dev mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: Sort with multiple reducers not working?
Date Wed, 31 Jul 2013 19:15:51 GMT
I was just playing around with the HFile output format patch and ran into
this same issue without realizing what the cause was, and only later made
the connection to this thread.

One way we could test things like this is with a MiniMRCluster, which is
accessible via the HBaseTestingUtility. That way we could start up a "real"
cluster that doesn't run in local mode, and then test things like multiple
regions as well as the sorting code. The drawback is that it slows down the
tests, but since we're already starting up a mini HBase cluster for the
HBase tests, I think that's probably acceptable.
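
To make concrete why I think local mode can't cover this, here is a rough,
self-contained Java sketch. It is not the Crunch or Hadoop code: the class
PartitionCheckSketch and its methods are hypothetical names, and it assumes
that the partitions file holds one fewer split point than the number of
reducers the job was configured with, while the partitioner running under
the LocalJobRunner sees a reducer count of 1.

```java
import java.io.IOException;
import java.util.Arrays;

public class PartitionCheckSketch {

    // Simplified stand-in (assumption) for the consistency check behind
    // "Wrong number of partitions in keyset": N reducers need N - 1 split points.
    static void checkSplitPoints(String[] splitPoints, int numReduceTasks)
            throws IOException {
        if (splitPoints.length != numReduceTasks - 1) {
            throw new IOException("Wrong number of partitions in keyset");
        }
    }

    // Maps a key to a partition index: the number of split points <= key.
    // splitPoints must be sorted in ascending order.
    static int getPartition(String key, String[] splitPoints) {
        int pos = Arrays.binarySearch(splitPoints, key);
        // Found at index i -> partition i + 1; otherwise the insertion point.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    public static void main(String[] args) throws IOException {
        // Split points as they would be written for 3 configured reducers.
        String[] splitPoints = {"g", "p"};

        // With the configured reducer count the check passes.
        checkSplitPoints(splitPoints, 3);
        System.out.println(getPartition("a", splitPoints)); // 0
        System.out.println(getPartition("g", splitPoints)); // 1
        System.out.println(getPartition("m", splitPoints)); // 1
        System.out.println(getPartition("z", splitPoints)); // 2

        // If the runner forces a single reducer, the same split points
        // no longer match and the check fails.
        try {
            checkSplitPoints(splitPoints, 1);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running main prints the partition indices for the three-reducer layout and
then the same message as in Chao's trace for the single-reducer case. A test
on a MiniMRCluster should avoid that mismatch, since it doesn't run in local
mode.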

- Gabriel


On Wed, Jul 31, 2013 at 5:17 PM, Josh Wills <josh.wills@gmail.com> wrote:

> Not that I know of.
>
>
> On Tue, Jul 30, 2013 at 11:54 PM, Chao Shi <stepinto@live.com> wrote:
>
> > Got it. I had to test my patch on a real cluster manually and it works.
> > Is there any way to do it in a unit test?
> >
> >
> > On Tue, Jul 30, 2013 at 11:32 PM, Josh Wills <jwills@cloudera.com> wrote:
> >
> > > Hey Chao,
> > >
> > > It's just a problem w/the LocalJobRunner, which always uses a single
> > > reducer no matter what you set it to in the configuration.
> > >
> > > J
> > >
> > >
> > > On Tue, Jul 30, 2013 at 1:06 AM, Chao Shi <stepinto@live.com> wrote:
> > >
> > > > Hi devs,
> > > >
> > > > Has anyone tried sorting with multiple reducers? I seem to hit a
> > > > problem with this when trying to implement the HFile bulk loader.
> > > >
> > > > You can reproduce this as follows:
> > > > 1. modify SortIT to run multiple reducers
> > > > 2. run SortIT#testWritableSortDesc
> > > >
> > > > I got this exception:
> > > > java.lang.IllegalArgumentException: Can't read partitions file
> > > >         at org.apache.crunch.lib.sort.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:81)
> > > >         at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> > > >         at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> > > >         at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:677)
> > > >         at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
> > > >         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> > > >         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:214)
> > > > Caused by: java.io.IOException: Wrong number of partitions in keyset
> > > >         at org.apache.crunch.lib.sort.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:77)
> > > >         ... 6 more
> > > >
> > > > It seems that TotalOrderPartitioner does not receive the correct
> > > > number of reducers. Any ideas?
> > > >
> > > > Thanks,
> > > > Chao
> > > >
> > >
> > >
> > >
> > > --
> > > Director of Data Science
> > > Cloudera <http://www.cloudera.com>
> > > Twitter: @josh_wills <http://twitter.com/josh_wills>
> > >
> >
>
