crunch-dev mailing list archives

From Clément MATHIEU (JIRA) <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-579) Support Counters from Custom RecordWriters
Date Tue, 19 Apr 2016 19:37:25 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15248468#comment-15248468 ]

Clément MATHIEU commented on CRUNCH-579:
----------------------------------------

I ran into the same issue last week. I downloaded the patch and back-ported it to
crunch-0.11.0-cdh5.4.2 / Hadoop 2. It works fine, and we are now using it in production.

It would be great if this patch could be merged. I don't see why it would not work on master,
but I can test it there if really needed.

> Support Counters from Custom RecordWriters
> ------------------------------------------
>
>                 Key: CRUNCH-579
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-579
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Micah Whitacre
>            Assignee: Josh Wills
>         Attachments: CRUNCH-579.patch
>
>
> A consumer mentioned this on the mailing list:
> {quote}
> So we wrote a custom Crunch Target to write data to a particular location, which included
> providing RecordWriters, OutputFormat, OutputCommitters and such. Now we wanted to add Counters
> to get a count of how much data our reducers are writing. The most obvious design was to use
> the TaskAttemptContext that gets passed to the RecordWriter and call its getCounter() method
> to manipulate the counters. However, that did not work as expected: even though the
> counters were being incremented, they did not show up in the Resource Manager UI. On further
> investigation I found that Crunch mangles this context object to add in a named output via
> the CrunchOutputs class[1], which basically makes the counters useless within the RecordWriter
> class.
> Would it be a feasible enhancement for CrunchOutputs to pass in the original base
> context object along with the modified one with named outputs, so that it can be used for counters?
> Any other suggestions are most welcome as well.
> {quote}
> http://mail-archives.apache.org/mod_mbox/crunch-user/201511.mbox/%3CCAHnpetQpcSqFhWFZ9ZJg6DkN02jeC%3DLpvJ0%2BVSP%2BoA%2B8c0DK%2Bw%40mail.gmail.com%3E
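
The behavior described in the quoted message can be pictured with a small, self-contained model. This is only a sketch under stated assumptions: every type below (Context, BaseContext, DetachedNamedOutputContext, DelegatingNamedOutputContext, CountingWriter) is a hypothetical stand-in, not a Hadoop or Crunch class, and it does not reproduce the attached patch. It assumes the per-output context keeps counters that are never reported, and contrasts that with a context that forwards counter updates to the base context, which is the enhancement the message asks about.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model only; none of these types are Hadoop or Crunch classes.
final class CounterSketch {

    /** Minimal stand-in for a task context that can increment counters. */
    interface Context {
        void increment(String group, String name, long amount);
    }

    /** Base context: in this model, the only counters that get reported. */
    static final class BaseContext implements Context {
        private final Map<String, Long> reported = new HashMap<>();

        @Override
        public void increment(String group, String name, long amount) {
            reported.merge(group + ":" + name, amount, Long::sum);
        }

        long reportedValue(String group, String name) {
            return reported.getOrDefault(group + ":" + name, 0L);
        }
    }

    /** Per-output context with its own counters; increments never reach the base. */
    static final class DetachedNamedOutputContext implements Context {
        private final Map<String, Long> local = new HashMap<>();

        @Override
        public void increment(String group, String name, long amount) {
            local.merge(group + ":" + name, amount, Long::sum);
        }
    }

    /** Per-output context that forwards counter updates to the base context. */
    static final class DelegatingNamedOutputContext implements Context {
        private final Context base;

        DelegatingNamedOutputContext(Context base) {
            this.base = base;
        }

        @Override
        public void increment(String group, String name, long amount) {
            base.increment(group, name, amount);
        }
    }

    /** Stand-in for a custom record writer that counts each record it writes. */
    static final class CountingWriter {
        private final Context context;

        CountingWriter(Context context) {
            this.context = context;
        }

        void write(String record) {
            // A real writer would also persist the record; here we only count it.
            context.increment("MyOutput", "RECORDS_WRITTEN", 1L);
        }
    }

    /** Writes the given number of records and returns what the base context reports. */
    static long writeAndReport(boolean delegate, int records) {
        BaseContext base = new BaseContext();
        Context perOutput = delegate
                ? new DelegatingNamedOutputContext(base)
                : new DetachedNamedOutputContext();
        CountingWriter writer = new CountingWriter(perOutput);
        for (int i = 0; i < records; i++) {
            writer.write("record-" + i);
        }
        return base.reportedValue("MyOutput", "RECORDS_WRITTEN");
    }

    public static void main(String[] args) {
        System.out.println("detached:   " + writeAndReport(false, 3));
        System.out.println("delegating: " + writeAndReport(true, 3));
    }
}
```

In this model the detached variant reports 0 for the counter while the delegating variant reports 3, which mirrors the symptom (counters incremented but not visible) and the requested fix (counter access routed to the original context).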



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
