hadoop-common-user mailing list archives

From Premal Shah <premal.j.s...@gmail.com>
Subject Hadoop Streaming Combiner Problem
Date Thu, 04 Aug 2011 20:34:48 GMT
According to the Hadoop Streaming
docs<http://hadoop.apache.org/common/docs/r0.20.0/streaming.html#Working+with+the+Hadoop+Aggregate+Package+%28the+-reduce+aggregate+option%29>,
there is a built-in Aggregate Java class that can work as both a combiner and
a reducer.

Here is the command:
*shell> hadoop jar hadoop-streaming.jar -file mapper.py -mapper mapper.py
-combiner aggregate -reducer NONE -input input_files -output output_path*

Running this command causes the mapper to fail with this error:
*java.io.IOException: Cannot run program "aggregate": java.io.IOException:
error=2, No such file or directory*

However, if you run this command using aggregate as the reducer and not the
combiner, the job works fine.
*shell> hadoop jar hadoop-streaming.jar -file mapper.py -mapper mapper.py
-reducer aggregate -input input_files -output output_path*
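
For context, the aggregate reducer expects each mapper output line to name an aggregator, in the form "LongValueSum:key<TAB>value". The mapper.py used in the commands above is not shown in this message, so the following is only a minimal sketch, assuming a word-count style job; the function names are made up for illustration.

```python
#!/usr/bin/env python
# Hypothetical sketch of an aggregate-compatible streaming mapper.
# Assumption: a word-count style job; this is not the original mapper.py.
import sys


def long_value_sum_record(key, count=1):
    # "LongValueSum:" asks the Aggregate package to sum the values per key.
    return "LongValueSum:%s\t%d" % (key, count)


def map_lines(lines):
    # Yield one aggregate record for every whitespace-separated word.
    for line in lines:
        for word in line.split():
            yield long_value_sum_record(word)


# Demo on a fixed sample; a real mapper.py would iterate over sys.stdin.
for record in map_lines(["a b a"]):
    sys.stdout.write(record + "\n")
```

With records in this shape, the aggregate reducer would sum the counts for each word.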

What am I doing wrong? Is aggregate being treated as a command rather than a
JavaClassName when it is passed to -combiner? If so, how do I use the
JavaClassName instead?

-- 
Regards,
Premal Shah.
