crunch-dev mailing list archives

From Tom White <>
Subject Re: Support OutputCommitter?
Date Thu, 27 Feb 2014 10:48:49 GMT
Hi Chao,

Crunch doesn't call the output committer explicitly itself; it's
called by the MR framework as a normal part of running a job. However,
in Crunch's MapReduceTarget#configureForMapReduce the output format is
not typically set for the named-output case (which is the only case
that is executed now, as I discovered in the thread mentioned below),
so it defaults to FileOutputFormat, with its semantics. (This is why
HBaseTarget calls FileOutputFormat.setOutputPath, which it wouldn't
have to if it set the output format explicitly to HBase's

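A minimal sketch of the mechanism described above, as a toy model in plain Java rather than Hadoop's or Crunch's real classes. It assumes only what the paragraph states: the framework obtains the committer from whichever output format the job is configured with, and falls back to a file-based format when none is set. Every name here (`Committer`, `Format`, `FileFormat`, `CustomFormat`, `JobSettings`, `runJob`) is hypothetical and exists purely for illustration. The point it shows is that a custom committer never runs unless its format is set explicitly.

```java
import java.util.ArrayList;
import java.util.List;

public class CommitterLookupSketch {
    /** Hypothetical stand-in for an output committer; only a job-level commit hook is modeled. */
    interface Committer {
        void commitJob(List<String> log);
    }

    /** Hypothetical stand-in for an output format: the format supplies the committer. */
    interface Format {
        Committer getOutputCommitter();
    }

    /** Models the file-based default that applies when no format is configured. */
    static class FileFormat implements Format {
        public Committer getOutputCommitter() {
            return log -> log.add("file-commit");
        }
    }

    /** Models a format with its own committer (the role HCatOutputFormat plays in the thread). */
    static class CustomFormat implements Format {
        public Committer getOutputCommitter() {
            return log -> log.add("custom-commit");
        }
    }

    /** Hypothetical job settings; outputFormat stays null unless set explicitly. */
    static class JobSettings {
        Format outputFormat;
    }

    /** Models the framework side: resolve the format, then invoke that format's committer. */
    static List<String> runJob(JobSettings settings) {
        Format format = settings.outputFormat != null ? settings.outputFormat : new FileFormat();
        List<String> log = new ArrayList<>();
        format.getOutputCommitter().commitJob(log);
        return log;
    }

    public static void main(String[] args) {
        // Format left unset: the default file-based committer runs.
        JobSettings unset = new JobSettings();
        System.out.println(runJob(unset)); // prints [file-commit]

        // Format set explicitly: the custom committer runs instead.
        JobSettings explicit = new JobSettings();
        explicit.outputFormat = new CustomFormat();
        System.out.println(runJob(explicit)); // prints [custom-commit]
    }
}
```
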
Are you setting the HCatOutputFormat in the named-output case? In the
Crunch Target I'm writing I've set the OutputFormat explicitly:


On Thu, Feb 27, 2014 at 7:54 AM, Gabriel Reid <> wrote:
> For reference, here's the link to the previous thread on this:
> On Thu, Feb 27, 2014 at 7:56 AM, Josh Wills <> wrote:
>> +tom
>> Didn't Tom have a thing like this a little while ago?
>> On Wed, Feb 26, 2014 at 8:04 PM, Chao Shi <> wrote:
>>> Hi crunch devs,
>>> I'm developing a target wrapper for HCatOutputFormat, which uses a custom
>>> OutputCommitter to get results committed to Hive. It seems its
>>> OutputCommitter is not called at all. Looking into the code, I can't find
>>> where Crunch calls it. Is it really supported?
>>> Thanks,
>>> Chao
>> --
>> Director of Data Science
>> Cloudera <>
>> Twitter: @josh_wills <>
