hadoop-common-user mailing list archives

From Steve Lewis <lordjoe2...@gmail.com>
Subject Approaches to combining the output of reducers
Date Sat, 23 Oct 2010 22:08:18 GMT
Once I run a map-reduce job I get output in the form of
part-r-00000 part-r-00001 ...

In many cases the output is significantly smaller than the original input -
take the classic word count, for example.

In most cases I want to combine the output into a single file, which may well
live not on HDFS but on a more accessible file system.

Are there standard libraries or approaches for consolidating reducer
output?

A second Map-Reduce job taking the output directory as its input is an OK
start, but it would need a single reducer that writes a real file
rather than reducer output.

Are there standard libraries or approaches for this?
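
To make the goal concrete, here is a minimal sketch of the consolidation
step I have in mind. It assumes the part-r-* files have already been copied
to a directory on a local file system. The class name PartFileMerger and its
merge method are hypothetical, made up for illustration; it uses only
java.nio and no Hadoop API, so it is not a claim about what Hadoop provides.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: concatenates reducer part files into one plain file.
public class PartFileMerger {

    /**
     * Concatenates every file named part-r-* in jobOutputDir, in name order,
     * into target. Other files (for example _SUCCESS markers) are ignored.
     * Returns the number of part files that were merged.
     */
    public static int merge(Path jobOutputDir, Path target) throws IOException {
        List<Path> parts = new ArrayList<>();
        try (DirectoryStream<Path> stream =
                 Files.newDirectoryStream(jobOutputDir, "part-r-*")) {
            for (Path p : stream) {
                parts.add(p);
            }
        }
        // Directory listings are unordered; sort so part-r-00000 comes first.
        Collections.sort(parts);

        // Creates the target, or truncates it if it already exists.
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path part : parts) {
                Files.copy(part, out); // append this part's bytes unchanged
            }
        }
        return parts.size();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: PartFileMerger <job-output-dir> <target-file>");
            System.exit(1);
        }
        int merged = merge(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("merged " + merged + " part files");
    }
}
```

This only concatenates bytes, so it suits line-oriented text output such as
word count; the target file should live outside the job output directory.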

-- 
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA
