chukwa-dev mailing list archives

From Jerome Boulon <jbou...@yahoo-inc.com>
Subject Re: Gotcha's in writing ChukwaRecords using ChukwaRecordOutputFormat?
Date Thu, 04 Jun 2009 17:41:38 GMT
That's the standard Demux behavior. Demux groups all records for the same recordType onto the
same reducer.

If you want to use more than one reducer/writer for the same recordType, then you need to change
the naming convention to include the leaf filename.
There's an example of how to do this in the ChukwaArchiveStreamNameOutputFormat class.

/Jerome.
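
The naming issue described above can be sketched without Hadoop. This is a minimal simulation, not Chukwa's actual code: it assumes that each reducer creates its output file from scratch, so two reducers that compute the same path leave only the last writer's records. The class OutputNamingSketch, both naming methods, the ".evt" suffix, and the "MyType" record type are hypothetical names chosen for illustration; the "part-00000" style leaf name mirrors what Hadoop passes to MultipleOutputFormat.generateFileNameForKeyValue.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class OutputNamingSketch {

    // Hypothetical naming rule: the path depends only on the record type,
    // so every reducer handling that type computes the same path.
    static String nameWithoutLeaf(String recordType, String leaf) {
        return recordType + ".evt";
    }

    // Hypothetical naming rule: the per-reducer leaf name (e.g. "part-00001")
    // is part of the path, so each reducer gets its own file.
    static String nameWithLeaf(String recordType, String leaf) {
        return recordType + "_" + leaf + ".evt";
    }

    // Simulates `reducers` reduce tasks, each writing `recordsPerReducer`
    // records of one record type, and returns how many records survive.
    static int survivingRecords(boolean includeLeaf, int reducers, int recordsPerReducer) {
        // Assumption: a toy file system where creating a path replaces any
        // file that already exists at that path.
        Map<String, List<String>> fileSystem = new HashMap<>();
        for (int r = 0; r < reducers; r++) {
            String leaf = String.format("part-%05d", r);
            String path = includeLeaf
                    ? nameWithLeaf("MyType", leaf)
                    : nameWithoutLeaf("MyType", leaf);
            List<String> file = new ArrayList<>();
            fileSystem.put(path, file); // create (or overwrite) the output file
            for (int i = 0; i < recordsPerReducer; i++) {
                file.add("record-" + r + "-" + i);
            }
        }
        int total = 0;
        for (List<String> file : fileSystem.values()) {
            total += file.size();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("without leaf name: " + survivingRecords(false, 2, 100));
        System.out.println("with leaf name:    " + survivingRecords(true, 2, 100));
    }
}
```

Under these assumptions, two reducers writing 100 records each leave 100 records when the leaf name is omitted and 200 when it is included, which matches the symptom reported in the quoted message below.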

On 6/4/09 10:17 AM, "Jiaqi Tan" <tanjiaqi@gmail.com> wrote:

Hi,

I changed my number of reducers from 2 to 1 and that fixed the
problem. Does the MultipleOutputFormat require that writes to the same
file be performed by only one reducer?

Jiaqi

On Thu, Jun 4, 2009 at 9:52 AM, Jiaqi Tan <tanjiaqi@gmail.com> wrote:
> Hi,
>
> I'm having weird behaviour in writing ChukwaRecords using the
> ChukwaRecordOutputFormat. I wrote 200 records out, and Hadoop reports
> that I had a total of 200 output records from my job, but when I used
> bin/dumpRecord.sh on the output file, I had only 100 records in the
> SequenceFile. Any idea what the problem could be?
>
> I've checked the values used for the ChukwaRecordKey's, and they're
> all unique strings.
>
> Thanks,
> Jiaqi
>

