hadoop-common-user mailing list archives

From Peter W. <pe...@marketingbrokers.com>
Subject Re: DecreasingComparator
Date Mon, 02 Jul 2007 21:04:42 GMT
Devaraj,

You are correct that I wanted to order by url count.

The partition file output of a Hadoop task seems to be a dump
of the key/value pairs, so generally I'm interested in expressing
Collections.sort(value) in either the map or reduce method.
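
As a rough illustration of that idea outside of Hadoop, here is a minimal
plain-Java sketch that sorts url counts from high to low with
Collections.sort. The class name SortByCount and the method sortDescending
are made up for this example, and the counts are the ones from the original
question. It assumes every (url, count) pair is available in one place,
which is generally not true inside a single reduce call: reduce sees the
values for one key at a time, so a global ordering by count would still
need something like the second pass Devaraj describes below.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: orders url counts high to low.
public class SortByCount {

    // Returns the entries of 'counts' ordered by value, highest first.
    static List<Map.Entry<String, Long>> sortDescending(Map<String, Long> counts) {
        List<Map.Entry<String, Long>> entries =
            new ArrayList<Map.Entry<String, Long>>(counts.entrySet());
        Collections.sort(entries, new Comparator<Map.Entry<String, Long>>() {
            @Override
            public int compare(Map.Entry<String, Long> a, Map.Entry<String, Long> b) {
                // Arguments are swapped (b before a) to get descending order.
                return Long.compare(b.getValue(), a.getValue());
            }
        });
        return entries;
    }

    public static void main(String[] args) {
        // Counts taken from the mapreduce output in the original question.
        Map<String, Long> counts = new LinkedHashMap<String, Long>();
        counts.put("urla.com", 3L);
        counts.put("urlb.com", 2L);
        counts.put("urlc.com", 4L);
        counts.put("urld.com", 1L);

        for (Map.Entry<String, Long> e : sortDescending(counts)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Run on the sample counts, this lists urlc.com first and urld.com last.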

Regards,

Peter W.



On Jul 1, 2007, at 9:39 PM, Devaraj Das wrote:

> Your first MapReduce phase is very similar to the WordCount example.
> The only difference is that you need to create LongWritable objects
> for the values. The output format should be
> SequenceFileOutputFormat.class.
>
> Run a subsequent MapReduce phase with the input format set to
> SequenceFileInputFormat.class, the map class set to
> InverseMapping.class, and the OutputKeyComparator set to
> LongWritable.DecreasingComparator.class.
>
>
> By the way, the 2nd MapReduce phase won't work unless you patch your
> version of Hadoop with
> https://issues.apache.org/jira/secure/attachment/12360717/1535_01.patch .
> This hasn't been committed yet.
>
> -----Original Message-----
> From: Peter W. [mailto:peter@marketingbrokers.com]
> Sent: Monday, July 02, 2007 6:08 AM
> To: hadoop-user@lucene.apache.org
> Subject: DecreasingComparator
>
> Hello,
>
> I have a modified WordCount program with the following characteristics:
>
> input file:
> urla.com,urlb.com
> urla.com,urlc.com
> urlb.com,urlc.com
> urlc.com,urla.com
> urld.com,urlc.com
>
> mapreduce output:
> urla.com 3
> urlb.com 2
> urlc.com 4
> urld.com 1
>
> Next, I tried using a comparator with a different JobConf and mapreduce:
>
> jc.setOutputKeyComparatorClass(LongWritable.DecreasingComparator.class);
>
> but it didn't work because the values are IntWritable and my
> OutputCollector wasn't picking up the right things...
>
> What do I need to collect in both the map and reduce for the final
> result to sort descending high-low?
>
> Thanks,
>
> Peter W.
>

