crunch-dev mailing list archives

From Dierickx Dominique <d.dieri...@gmail.com>
Subject Re: Too many Crunch output counters
Date Mon, 20 Jan 2014 13:02:55 GMT
OK now I am subscribed! @Josh I will try and implement this this week and
submit a patch in the Jira tracker. Thanks.
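
A rough sketch of the kind of switch being discussed, not the actual patch. It assumes a boolean flag read once from the job configuration. The property name, the class, and the use of java.util.Properties in place of Hadoop's Configuration are all hypothetical, chosen only so the example runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Stand-in for a writer that fans records out to many named outputs.
// Nothing here is Crunch or Hadoop API; it only illustrates the idea.
class OutputCounterSketch {
    // Hypothetical property name, not taken from Crunch.
    static final String DISABLE_COUNTERS_KEY = "example.disable.output.counters";

    private final boolean countersDisabled;
    private final Map<String, Long> counters = new HashMap<>();
    private long recordsWritten = 0;

    OutputCounterSketch(Properties conf) {
        // Read the flag once, so the per-record path only tests a boolean field.
        this.countersDisabled =
            Boolean.parseBoolean(conf.getProperty(DISABLE_COUNTERS_KEY, "false"));
    }

    void write(String namedOutput, String value) {
        recordsWritten++; // stands in for the real write to the named output
        if (!countersDisabled) {
            // One counter per named output: this is what grows past the limit.
            counters.merge(namedOutput, 1L, Long::sum);
        }
    }

    int counterCount() {
        return counters.size();
    }

    long getRecordsWritten() {
        return recordsWritten;
    }
}
```

With the flag left at its default, each distinct named output gets its own counter. With the flag set to "true", records are still written but no counters are created, so the per-record cost is a single boolean check.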


2014/1/20 Stefan De Smit <stefan.desmit@gmail.com>

> That's what he wants, but he can't reply as he wasn't yet subscribed to the
> mailing list.
>
>
> On Thu, Jan 16, 2014 at 7:27 PM, Josh Wills <jwills@cloudera.com> wrote:
>
> > That seems relatively benign; do you need a crunch param that would control
> > the usage of the named outputs counter?
> >
> >
> > On Thu, Jan 16, 2014 at 8:40 AM, Dierickx Dominique <d.dierickx@gmail.com> wrote:
> >
> > > We're having some trouble with the number of counters that Crunch creates
> > > when writing to a lot of different output files (slightly more than 120).
> > > This wouldn't be an issue if we were able to configure the maximum number
> > > of allowed counters, but unfortunately, because we are running an older
> > > version of Hadoop, doing this is not an option and we are required to patch
> > > Crunch locally when using a new release to leave out the counters. The
> > > required patch (one line...) can be found in the attachment.
> > >
> > > I'm not saying the counters should be removed but maybe it is an option to
> > > make them configurable without paying too much of a performance penalty?
> > >
> > > Regards,
> > > Dominique Dierickx
> > >
> > >
> >
> >
> > --
> > Director of Data Science
> > Cloudera <http://www.cloudera.com>
> > Twitter: @josh_wills <http://twitter.com/josh_wills>
> >
>
