hadoop-common-user mailing list archives

From Michael Robinson <hadoopmich...@gmail.com>
Subject Re: calling C programs from Hadoop
Date Mon, 31 May 2010 16:47:44 GMT


Reading the "Hadoop Streaming" documentation, I found the following:

"How Does Streaming Work
In the above example, both the mapper and the reducer are executables that
read the input from stdin (line by line) and emit the output to stdout. The
utility will create a Map/Reduce job, submit the job to an appropriate
cluster, and monitor the progress of the job until it completes."

I am beginning to think that my understanding of map/reduce is faulty. At
this time, I understand that the mapper takes in data, splits it into
chunks, and creates lists of (<key>, <value>) pairs; it then combines this
output and sends the result to the reducer.

The C program I have reads each line in the input file and searches a master
file, looking for exact and similar matches; it then does computations based
on how similar the results are, so there is no need to create (<key>,
<value>) lists.

Thanks very much

View this message in context: http://lucene.472066.n3.nabble.com/calling-C-programs-from-Hadoop-tp854833p859041.html
