crunch-user mailing list archives

From "kulkarni.swarnim@gmail.com" <kulkarni.swar...@gmail.com>
Subject Custom Crunch Target and Counters
Date Fri, 20 Nov 2015 06:39:41 GMT
Hello,

We wrote a custom Crunch Target to write data to a particular location,
which included providing a RecordWriter, OutputFormat, OutputCommitter and
such. Now we want to add Counters to track how much data our
reducers are writing. The most obvious design was to take the
TaskAttemptContext that gets passed to the RecordWriter and call its
getCounter() method to update the counters. However, that did not
work as expected: even though the counters were being incremented,
they did not show up in the Resource Manager UI. On further investigation I
found that Crunch modifies this context object to add a named output in
the CrunchOutputs class[1], which effectively makes the counters useless
inside the RecordWriter.

Would it be a feasible enhancement for CrunchOutputs to pass in the
original base context object along with the modified one that carries the
named output, so that the original can be used for counters? Any other
suggestions are most welcome as well.
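
To make the idea concrete, here is a minimal, dependency-free Java sketch of
the proposal. The Counters, Context and CountingWriter classes are
hypothetical stand-ins, not the real Hadoop or Crunch types, and the sketch
assumes that the modified context carries its own counter store that is never
reported back to the framework.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only: these are NOT the real Hadoop or Crunch classes.
class CounterContextSketch {

    /** Minimal counter store keyed by "group:name". */
    static class Counters {
        private final Map<String, Long> values = new HashMap<>();

        void increment(String group, String name, long delta) {
            values.merge(group + ":" + name, delta, Long::sum);
        }

        long get(String group, String name) {
            return values.getOrDefault(group + ":" + name, 0L);
        }
    }

    /** Stand-in for a task attempt context that exposes counters. */
    static class Context {
        final Counters counters;

        Context(Counters counters) {
            this.counters = counters;
        }
    }

    /** Writer that is handed both contexts, as the enhancement proposes. */
    static class CountingWriter {
        private final Context baseContext;

        CountingWriter(Context namedOutputContext, Context baseContext) {
            // The named-output context would still drive the actual write;
            // it is unused in this sketch because no data is really written.
            this.baseContext = baseContext;
        }

        void write(String record) {
            // Count against the base context so the framework sees the update.
            baseContext.counters.increment("CustomTarget", "RECORDS_WRITTEN", 1L);
        }
    }

    public static void main(String[] args) {
        Counters reported = new Counters();  // assumed: what the UI would show
        Counters detached = new Counters();  // assumed: store of the modified context

        CountingWriter writer =
            new CountingWriter(new Context(detached), new Context(reported));
        writer.write("a");
        writer.write("b");

        System.out.println("reported=" + reported.get("CustomTarget", "RECORDS_WRITTEN"));
        System.out.println("detached=" + detached.get("CustomTarget", "RECORDS_WRITTEN"));
    }
}
```

In this sketch only the store behind the base context receives the
increments, which is the behaviour we would want from the real
TaskAttemptContext if CrunchOutputs handed it through.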

Thanks,
Swarnim

[1]
https://github.com/cloudera/crunch/blob/cdh5.4.8-release/crunch-core/src/main/java/org/apache/crunch/io/CrunchOutputs.java#L210-L232
