hadoop-common-user mailing list archives

From Enis Soztutar <enis....@gmail.com>
Subject Re: Counting no. of keys.
Date Mon, 03 Aug 2009 12:53:34 GMT
prashant ullegaddi wrote:
> Hi,
> I have, say, 800 sequence files written using SequenceFileOutputFormat.
> Is there any way to know the number of unique keys in those sequence files?
> Thanks,
> Prashant.
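You can use the job counters "map output records" and "reduce output
records" for this, taken from a MapReduce job that reads those files.
If you can guarantee that every output key from reduce is unique (for
example, the reducer emits one record per key), then the reduce output
records count is what you're looking for. If you're not using the
reduce phase, then use map output records; note that this equals the
number of unique keys only if the map output keys are already unique.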
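
To illustrate why the reduce-side counter can stand in for a unique-key
count, here is a minimal sketch in plain Python. It is not Hadoop code:
it only simulates the map, shuffle, and reduce steps in memory. The
function name run_job, the sample records, and the identity mapper are
assumptions made up for this illustration; only the two counter names
come from the message above.

```python
from collections import defaultdict


def run_job(records, use_reduce=True):
    """Simulate a tiny map/shuffle/reduce pass (hypothetical helper, not Hadoop API).

    Returns (counters, output), where counters mimics the two job counters
    mentioned in the reply.
    """
    counters = {"map output records": 0, "reduce output records": 0}

    # Map phase: an identity mapper that emits one record per input record.
    map_output = []
    for key, value in records:
        map_output.append((key, value))
        counters["map output records"] += 1

    if not use_reduce:
        # Map-only job: the map output is the final output.
        return counters, map_output

    # Shuffle: group all values that share a key.
    groups = defaultdict(list)
    for key, value in map_output:
        groups[key].append(value)

    # Reduce phase: emit exactly one record per key, so every output key
    # is unique and the counter equals the number of distinct keys.
    output = []
    for key in sorted(groups):
        output.append((key, len(groups[key])))
        counters["reduce output records"] += 1

    return counters, output


# Sample data (made up): five records, three distinct keys.
records = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
counters, output = run_job(records)
print(counters)  # {'map output records': 5, 'reduce output records': 3}
```

In this sketch the map-side counter is 5 because keys repeat in the
input, while the reduce-side counter is 3, the number of distinct keys.
The map-side counter would match only if no key appeared more than once.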
