hadoop-common-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Chaining MapReduce operations
Date Wed, 25 Oct 2006 20:51:06 GMT
David Pollak wrote:
> Howdy,
>
> Can anyone point me to an example of chaining MapReduce operations 
> (e.g., Word Count followed by inverting the "Word" and "Count" columns 
> so one wound up with the list sorted by count rather than by word)?

Nutch uses this in several places, e.g.:

org.apache.nutch.crawl.Generator
org.apache.nutch.crawl.CrawlDbReader
org.apache.nutch.crawl.Injector
org.apache.nutch.indexer.DeleteDuplicates
...

Run a second job over the output of the first and simply switch the key
and value in its map() method. Just remember to set the job's output
classes for keys and values accordingly.
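
To illustrate the idea, here is a minimal in-memory sketch in plain Java. It
does not use the Hadoop API at all: the class ChainSketch and the methods
wordCount, invert and sortByKey are hypothetical names made up for this
example, and sortByKey only stands in for the sort by key that the framework
performs between map and reduce. In a real job the swapped types would be
declared on the second job's configuration (e.g. with JobConf's
setOutputKeyClass and setOutputValueClass).

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for two chained jobs; not Hadoop code.
class ChainSketch {

    // "Job 1": word count. Output key = word, output value = count.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // "Job 2" map step: switch key and value, (word, count) -> (count, word).
    // Note that the output key type is now Integer and the value type String.
    static List<Map.Entry<Integer, String>> invert(Map<String, Integer> counts) {
        List<Map.Entry<Integer, String>> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.add(new AbstractMap.SimpleEntry<>(e.getValue(), e.getKey()));
        }
        return out;
    }

    // Stand-in for the framework's sort by key (ascending, stable).
    static List<Map.Entry<Integer, String>> sortByKey(
            List<Map.Entry<Integer, String>> pairs) {
        List<Map.Entry<Integer, String>> sorted = new ArrayList<>(pairs);
        sorted.sort(Map.Entry.comparingByKey());
        return sorted;
    }
}
```

Calling sortByKey(invert(wordCount(lines))) gives the (count, word) pairs
ordered by count, which is the "sorted by count rather than by word" list
from the question.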

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


