hadoop-mapreduce-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Hadoop Streaming job Fails - Permission Denied error
Date Tue, 13 Sep 2011 08:06:07 GMT
The env binary would be present, but do all your TT (TaskTracker) nodes have
python properly installed on them? The env program can't find the python
interpreter, and that's probably why your scripts with a shebang line don't run.
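
As a quick check (a minimal sketch; run it on each TT node), confirm that env
can resolve the interpreter your shebang names:

$ /usr/bin/env python -V

If that fails on a node, any script starting with "#!/usr/bin/env python" will
fail there, and env exits with code 127 ("command not found"), which matches
the subprocess exit code in the trace below.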

On Tue, Sep 13, 2011 at 1:12 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
> Thanks Jeremy. But I didn't follow 'redirect "stdout" to "stderr" at the
> entry point to your mapper and reducer'.
> Basically I'm a Java Hadoop developer and have no idea about python
> programming. Could you please help me with more details, like the lines of
> code I need to include to achieve this?
>
> Also, I drilled down further into my error logs and found the following
> lines as well:
>
> stderr logs
>
> /usr/bin/env: python
> : No such file or directory
> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
> failed with code 127
>     at
> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
>     at
> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
>     at org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
>     at
> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:478)
>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
> log4j:WARN No appenders could be found for logger
> (org.apache.hadoop.hdfs.DFSClient).
> log4j:WARN Please initialize the log4j system properly.
>
> I verified that '/usr/bin/env' exists, and it is present.
>
> Could you please provide a little more guidance on the same?
>
>
>
> On Tue, Sep 13, 2011 at 9:06 AM, Jeremy Lewi <jeremy@lewi.us> wrote:
>>
>> Bejoy,
>> The other problem I typically ran into with python streaming jobs was my
>> mapper or reducer writing to stdout. Since Hadoop Streaming uses stdout to
>> pass data back to the framework, any stray "print" statement will break the
>> pipe. The easiest way around this is to redirect "stdout" to "stderr" at
>> the entry point of your mapper and reducer; do this before you import any
>> modules, so that even if those modules call "print" the output gets
>> redirected.
>> Note: if you're using dumbo (though I don't think you are) the above
>> solution may not work, but I can send you a pointer.
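>>
>> As an illustration (a minimal sketch, assuming plain CPython scripts; the
>> names real_stdout and emit are mine, not from your code), the top of
>> WcStreamMap.py / WcStreamReduce.py could look like:
>>
>> #!/usr/bin/env python
>> import sys
>>
>> # Keep a handle to the real stdout for emitting key/value pairs, then
>> # point sys.stdout at stderr so stray prints can't corrupt the stream
>> # that Hadoop Streaming reads.
>> real_stdout = sys.stdout
>> sys.stdout = sys.stderr
>>
>> def emit(key, value):
>>     # Hadoop Streaming expects tab-separated key/value lines on stdout.
>>     real_stdout.write("%s\t%s\n" % (key, value))
>>
>> # ... your map/reduce logic then calls emit() instead of print ...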
>> J
>>
>> On Mon, Sep 12, 2011 at 8:27 AM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
>>>
>>> Thanks Jeremy. I tried your first suggestion and the mappers ran to
>>> completion. But then the reducers failed with another exception related
>>> to pipes. I believe it may be due to permission issues again. I tried
>>> setting a few additional config parameters, but that didn't do the job.
>>> Please find below the command used and the error logs from the JobTracker
>>> web UI.
>>>
>>> hadoop  jar
>>> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar
>>> -D hadoop.tmp.dir=/home/streaming/tmp/hadoop/ -D
>>> dfs.data.dir=/home/streaming/tmp -D
>>> mapred.local.dir=/home/streaming/tmp/local -D
>>> mapred.system.dir=/home/streaming/tmp/system -D
>>> mapred.temp.dir=/home/streaming/tmp/temp -input
>>> /userdata/bejoy/apps/wc/input -output /userdata/bejoy/apps/wc/output
>>> -mapper /home/streaming/WcStreamMap.py  -reducer
>>> /home/streaming/WcStreamReduce.py
>>>
>>>
>>> java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess
>>> failed with code 127
>>>     at
>>> org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
>>>     at
>>> org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
>>>     at
>>> org.apache.hadoop.streaming.PipeReducer.close(PipeReducer.java:137)
>>>     at
>>> org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:478)
>>>     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:416)
>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>     at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>>
>>>
>>> The folder permissions at the time of job execution are as follows
>>>
>>> cloudera@cloudera-vm:~$ ls -l  /home/streaming/
>>> drwxrwxrwx 5 root root 4096 2011-09-12 05:59 tmp
>>> -rwxrwxrwx 1 root root  707 2011-09-11 23:42 WcStreamMap.py
>>> -rwxrwxrwx 1 root root 1077 2011-09-11 23:42 WcStreamReduce.py
>>>
>>> cloudera@cloudera-vm:~$ ls -l /home/streaming/tmp/
>>> drwxrwxrwx 2 root root 4096 2011-09-12 06:12 hadoop
>>> drwxrwxrwx 2 root root 4096 2011-09-12 05:58 local
>>> drwxrwxrwx 2 root root 4096 2011-09-12 05:59 system
>>> drwxrwxrwx 2 root root 4096 2011-09-12 05:59 temp
>>>
>>> Am I missing something here?
>>>
>>> I haven't been using Linux for long, so I couldn't try your second
>>> suggestion of setting up the Linux task controller.
>>>
>>> Thanks a lot
>>>
>>> Regards
>>> Bejoy.K.S
>>>
>>>
>>>
>>> On Mon, Sep 12, 2011 at 6:20 AM, Jeremy Lewi <jeremy@lewi.us> wrote:
>>>>
>>>> I would suggest you try putting your mapper/reducer .py files in a
>>>> directory that is world-readable at every level, e.g. /tmp/test. I had
>>>> similar problems when I was using streaming, and I believe my workaround
>>>> was to put the mappers/reducers outside my home directory. The other,
>>>> more involved alternative is to set up the Linux task controller so you
>>>> can run your MR jobs as the user who submits them.
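>>>>
>>>> Something like this (a sketch; /tmp/test is just an example path, and
>>>> the source path is taken from your command):
>>>>
>>>> $ mkdir -p /tmp/test
>>>> $ cp /home/cloudera/bejoy/apps/inputs/wc/WcStream*.py /tmp/test/
>>>> $ chmod -R a+rx /tmp/test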
>>>> J
>>>>
>>>> On Mon, Sep 12, 2011 at 2:18 AM, Bejoy KS <bejoy.hadoop@gmail.com>
>>>> wrote:
>>>>>
>>>>> Hi
>>>>>       I wanted to try out Hadoop streaming and got the sample python
>>>>> code for the mapper and reducer. I copied both onto my local file
>>>>> system (LFS) and tried running the streaming job as mentioned in the
>>>>> documentation.
>>>>> Here is the command I used to run the job:
>>>>>
>>>>> hadoop  jar
>>>>> /usr/lib/hadoop-0.20/contrib/streaming/hadoop-streaming-0.20.2-cdh3u0.jar
>>>>> -input /userdata/bejoy/apps/wc/input -output /userdata/bejoy/apps/wc/output
>>>>> -mapper /home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py  -reducer
>>>>> /home/cloudera/bejoy/apps/inputs/wc/WcStreamReduce.py
>>>>>
>>>>> Here, other than the input and output, everything else is on LFS
>>>>> locations. However, the job is failing. The error log from the
>>>>> JobTracker URL is as follows:
>>>>>
>>>>> java.lang.RuntimeException: Error in configuring object
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:386)
>>>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
>>>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>>>>>     at java.security.AccessController.doPrivileged(Native Method)
>>>>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>>     at
>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>>>>     at org.apache.hadoop.mapred.Child.main(Child.java:262)
>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>     at
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>>     ... 9 more
>>>>> Caused by: java.lang.RuntimeException: Error in configuring object
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:93)
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:64)
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>>>>>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:34)
>>>>>     ... 14 more
>>>>> Caused by: java.lang.reflect.InvocationTargetException
>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>     at
>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>     at
>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>     at
>>>>> org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:88)
>>>>>     ... 17 more
>>>>> Caused by: java.lang.RuntimeException: configuration exception
>>>>>     at
>>>>> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:230)
>>>>>     at
>>>>> org.apache.hadoop.streaming.PipeMapper.configure(PipeMapper.java:66)
>>>>>     ... 22 more
>>>>> Caused by: java.io.IOException: Cannot run program
>>>>> "/home/cloudera/bejoy/apps/inputs/wc/WcStreamMap.py": java.io.IOException:
>>>>> error=13, Permission denied
>>>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
>>>>>     at
>>>>> org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
>>>>>     ... 23 more
>>>>> Caused by: java.io.IOException: java.io.IOException: error=13,
>>>>> Permission denied
>>>>>     at java.lang.UNIXProcess.<init>(UNIXProcess.java:148)
>>>>>     at java.lang.ProcessImpl.start(ProcessImpl.java:65)
>>>>>     at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
>>>>>     ... 24 more
>>>>>
>>>>> Based on the error, I checked the permissions of the mapper and
>>>>> reducer, and issued a chmod 777 command as well. Still no luck.
>>>>>
>>>>> The permission of the files are as follows
>>>>> cloudera@cloudera-vm:~$ ls -l /home/cloudera/bejoy/apps/inputs/wc/
>>>>> -rwxrwxrwx 1 cloudera cloudera  707 2011-09-11 23:42 WcStreamMap.py
>>>>> -rwxrwxrwx 1 cloudera cloudera 1077 2011-09-11 23:42 WcStreamReduce.py
>>>>>
>>>>> I'm testing this on the Cloudera demo VM, so the hadoop setup is in
>>>>> pseudo-distributed mode. Any help would be highly appreciated.
>>>>>
>>>>> Thank You
>>>>>
>>>>> Regards
>>>>> Bejoy.K.S
>>>>>
>>>>
>>>
>>
>
>



-- 
Harsh J
