crunch-user mailing list archives

From Sandy Ryza <sandy.r...@cloudera.com>
Subject Re: output records counter
Date Fri, 07 Jun 2013 18:13:41 GMT
Ok, makes sense, thanks Gabriel


On Fri, Jun 7, 2013 at 2:20 AM, Gabriel Reid <gabriel.reid@gmail.com> wrote:

> Hi Sandy,
>
> Crunch uses something similar to Hadoop's MultipleOutputFormat to allow
> writing multiple outputs in multiple formats from the same job. This leads
> to different counters being used for output, as there can be multiple
> outputs (and therefore multiple counters) from a single job.
>
> The main implementation class of this is o.a.c.io.CrunchOutputs, and the
> counters that contain the actual output count are published in the counter
> group with the name of that class, and the counter name of out<d>, where
> <d> is the index of the output for the job (i.e. starting from 0).
>
> - Gabriel
>
>
>
> On Fri, Jun 7, 2013 at 10:54 AM, Sandy Ryza <sandy.ryza@cloudera.com> wrote:
>
>> Hey All,
>>
>> Does Crunch not use the normal MR channels for outputting stuff?  I'm
>> noticing that when I look at a job's Counters, the output records are
>> always 0, even when I know data has been written.
>>
>> thanks
>> -Sandy
>>
>
>
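
The counter naming Gabriel describes can be sketched in plain Java. This is only an illustration, not Crunch's own code: the class CrunchOutputCounters and its methods are hypothetical, the group name assumes that "o.a.c.io.CrunchOutputs" expands to org.apache.crunch.io.CrunchOutputs, and the nested Map stands in for a snapshot of a job's counters (group name to counter name to value).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Crunch or Hadoop.
final class CrunchOutputCounters {
    // Assumption: full class name behind "o.a.c.io.CrunchOutputs".
    static final String GROUP = "org.apache.crunch.io.CrunchOutputs";

    // Counter name for the output at a zero-based index: out0, out1, ...
    static String counterName(int outputIndex) {
        return "out" + outputIndex;
    }

    // Sums out0, out1, ... until the first missing index.
    // Returns 0 if the group is absent from the snapshot.
    static long totalOutputRecords(Map<String, Map<String, Long>> counters) {
        Map<String, Long> group = counters.get(GROUP);
        if (group == null) {
            return 0L;
        }
        long total = 0L;
        for (int i = 0; group.containsKey(counterName(i)); i++) {
            total += group.get(counterName(i));
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up values for a job with two outputs.
        Map<String, Long> group = new HashMap<>();
        group.put(counterName(0), 120L);
        group.put(counterName(1), 30L);

        Map<String, Map<String, Long>> counters = new HashMap<>();
        counters.put(GROUP, group);

        System.out.println(totalOutputRecords(counters));
    }
}
```

Against a real Hadoop job, the same group and counter names could presumably be passed to job.getCounters().findCounter(group, name).getValue() instead of reading from a Map.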
