hadoop-mapreduce-dev mailing list archives

From Vivek Krishna <vivekris...@gmail.com>
Subject File Not found Bug. Hadoop Streaming
Date Fri, 03 Dec 2010 17:25:32 GMT
What?
   I am trying to run a Hadoop Streaming job. I wrote a simple Python
script called mapper.py and tested it locally with 'cat somefile.txt | python
mapper.py'.

The command:

   I tried every combination of paths I could think of in the following command:

   $HSTREAMING -Dmapred.reduce.tasks=0 -Dstream.non.zero.exit.is.failure=true \
     -input /ixml \
     -output /oxml \
     -mapper mapper.py \
     -file scripts/mapper.py \
     -inputreader "StreamXmlRecordReader,begin=channel,end=/channel"

PS:
I made sure mapper.py has execute permissions.
I tried -mapper '/usr/bin/python mapper.py'.
I also tried giving the full path to mapper.py.
I tried without -file.
I tried a couple of other streaming jars:
    hadoop-0.20.1+169.89-streaming.jar
    hadoop-0.20.2+228-streaming.jar

Nothing seems to work!!!


The Error:
java.io.IOException: Cannot run program "mapper.py": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
	at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
	at org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)........

......

 ERROR org.apache.hadoop.streaming.PipeMapRed: configuration exception
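
A few local causes of "error=2, No such file or directory" can be ruled out before resubmitting the job. The sketch below is illustrative only and is not a diagnosis from this message: `check_script` is a hypothetical helper, and it assumes the script is meant to be launched directly (as with `-mapper mapper.py`), in which case the operating system must be able to find the interpreter named on the shebang line. It checks only the machine it runs on, so the task nodes may still differ.

```python
import os


def check_script(path):
    """Return a list of local problems that can make exec() of a script fail.

    Hypothetical helper for illustration; it inspects the local file only.
    """
    if not os.path.isfile(path):
        return ["script not found: %s" % path]

    problems = []
    with open(path, "rb") as handle:
        first_line = handle.readline()

    if not first_line.startswith(b"#!"):
        problems.append("no shebang line at the top of the script")
    else:
        if first_line.endswith(b"\r\n"):
            # With Windows line endings the kernel looks for "python\r".
            problems.append(
                "shebang line ends with CRLF, so the kernel looks for "
                "an interpreter path ending in \\r"
            )
        fields = first_line[2:].split()
        if fields:
            interpreter = fields[0].decode("utf-8", "replace")
            if not os.path.exists(interpreter):
                problems.append(
                    "interpreter not found on this machine: %s" % interpreter
                )

    if not os.access(path, os.X_OK):
        problems.append("script is not executable by the current user")
    return problems


if __name__ == "__main__":
    # Path taken from the -file argument in the command above.
    for problem in check_script("scripts/mapper.py"):
        print(problem)
```

An empty result means none of these particular local problems were detected; it does not show that the same interpreter path exists on every task node.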


To verify that streaming works, I tried using the bin/wc program as the
mapper, and it works!

I understand that the -file option includes the file in the jar. How do I
refer to it on the command line? As far as I can tell from
http://wiki.apache.org/hadoop/HadoopStreaming , I am following the
instructions correctly.

Any help would be really appreciated.


Regards,
~Viv
