crunch-user mailing list archives

From Gabriel Reid <gabriel.r...@gmail.com>
Subject Re: output records counter
Date Fri, 07 Jun 2013 09:20:49 GMT
Hi Sandy,

Crunch uses something similar to Hadoop's MultipleOutputFormat so that a
single job can write multiple outputs in multiple formats. Because a job
can have several outputs, each output gets its own counter, and these are
different from the counters normally used for output.

The main implementation class for this is org.apache.crunch.io.CrunchOutputs.
The counters that contain the actual output counts are published in the
counter group with the name of that class, under the counter name out<d>,
where <d> is the zero-based index of the output within the job.
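
As a rough illustration of that naming scheme, here is a minimal sketch in
Java. It assumes the group name is the fully qualified class name
(org.apache.crunch.io.CrunchOutputs) and uses a plain nested Map as a
stand-in for the job's counters. The class CrunchOutputCounters, its
methods, and the sample values are hypothetical and not part of Crunch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Crunch: builds the group/counter names
// described above and looks them up in a stand-in counter map.
final class CrunchOutputCounters {
    // Assumption: the counter group is the fully qualified class name.
    static final String GROUP = "org.apache.crunch.io.CrunchOutputs";

    // Counter name for the output at the given zero-based index: out0, out1, ...
    static String counterName(int outputIndex) {
        return "out" + outputIndex;
    }

    // Returns the record count for one output, or 0 if the group or
    // counter is missing from the map.
    static long outputRecords(Map<String, Map<String, Long>> counters, int outputIndex) {
        Map<String, Long> group = counters.get(GROUP);
        if (group == null) {
            return 0L;
        }
        return group.getOrDefault(counterName(outputIndex), 0L);
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for a finished job's counters.
        Map<String, Long> group = new HashMap<>();
        group.put("out0", 1200L);
        group.put("out1", 35L);
        Map<String, Map<String, Long>> counters = new HashMap<>();
        counters.put(GROUP, group);

        for (int i = 0; i < 2; i++) {
            System.out.println(counterName(i) + " records: " + outputRecords(counters, i));
        }
    }
}
```

Against a real job, the same group and counter names could be passed to
Hadoop's job.getCounters().findCounter(group, name).getValue() instead of
the map lookup.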

- Gabriel



On Fri, Jun 7, 2013 at 10:54 AM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:

> Hey All,
>
> Does Crunch not use the normal MR channels for outputting stuff?  I'm
> noticing that when I look at a job's Counters, the output records are
> always 0, even when I know data has been written.
>
> thanks
> -Sandy
>
