hadoop-common-user mailing list archives

From "Gaurav Veda" <gve...@cs.cmu.edu>
Subject Difference between Hadoop Streaming and "Normal" mode
Date Tue, 12 Aug 2008 22:09:32 GMT
Hi All,

This might seem too silly, but I couldn't find a satisfactory answer
to this yet. What are the advantages / disadvantages of using Hadoop
Streaming over the normal mode (wherein you write your own mapper and
reducer in Java)? From what I gather, the real advantage of Hadoop
Streaming is that you can use any executable (written in C, Perl,
Python, etc.) as a mapper / reducer.
A slight disadvantage is that, by default, the executable reads from
standard input and writes to standard output, though one can specify
one's own input and output formats (and package them with the default
Hadoop streaming jar).
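
To make the stdin / stdout contract concrete, here is a minimal sketch of a
word-count mapper and reducer in Python. The names map_lines and
reduce_lines and the "map" / "reduce" argument are made up for illustration;
the sketch assumes tab-separated key/value records and reducer input already
sorted by key, which is how Streaming delivers it by default.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each key (input must be sorted by key)."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


# Hypothetical entry point: run as "script.py map" or "script.py reduce".
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

The same file could then be given to the streaming jar through its -mapper
and -reducer options, once with each argument.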

My point is, why should I ever use the normal mode? Streaming seems
just as good. Is there a performance problem, do I have only limited
control over my job in streaming mode, or is there some other issue?

Share what you know, learn what you don't!
