hadoop-common-user mailing list archives

From Christopher Penney <cpen...@gmail.com>
Subject Re: Tasktracker Permission Issue?
Date Mon, 23 Sep 2013 13:46:33 GMT
I resolved this by setting the umask such that mapred creates files with
group read permission.  I discovered that, for whatever reason, the
tasktracker running as mapred was trying to read the job.xml file before
the child had effectively set up the permissions.  Running
strace -f on the parent tasktracker process, I would see:

   * Child: open job.xml (with O_WRONLY|O_CREAT|O_TRUNC and mode 0666)
   * Child: chmod 777 job.xml
   * Child: chmod 640 job.xml
   * Parent: stat job.xml (permissions denied)

I'm assuming either strace isn't showing the parent stat in the right spot
or some caching effect is causing the stat to see the original permission
from the open.  If I run ls in a tight loop I can see that job.xml is
created with 600 permission and it's like that for dozens of iterations of
ls (in a while `true` loop).  This strikes me as some kind of bug.
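
A minimal sketch of the umask effect described above, not taken from Hadoop itself. It creates a file with the same open flags and requested mode (0666) seen in the strace output and reports the permission bits that result under two umask values. The values 077 and 027 and the helper name create_with_umask are assumptions for illustration; the message does not say which umask was in effect before or after the change.

```python
import os
import stat
import tempfile


def create_with_umask(path, umask, requested_mode=0o666):
    """Create `path` with O_WRONLY|O_CREAT|O_TRUNC and return its permission bits.

    The kernel applies (requested_mode & ~umask) when the file is created,
    so the umask decides whether group gets read access at creation time.
    """
    previous = os.umask(umask)  # os.umask returns the old value
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, requested_mode)
        os.close(fd)
    finally:
        os.umask(previous)  # always restore the caller's umask
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # 0o077 and 0o027 are assumed example values, not taken from the thread.
        for umask in (0o077, 0o027):
            path = os.path.join(tmp, "job_%03o.xml" % umask)
            mode = create_with_umask(path, umask)
            print("umask %03o -> mode %03o" % (umask, mode))
```

On a typical Linux filesystem without default ACLs, the first case should match the 600 seen in the ls loop, and the second should leave group read set from the moment the file is created, which is the window in which the parent's stat was failing.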

    Chris



On Wed, Sep 18, 2013 at 3:06 PM, Christopher Penney <cpenney@gmail.com> wrote:

>
> Here is some more info.  I realized if I run the tasktracker as root it
> works, but if I run it as mapred (which I assume is what I'm supposed to
> do) I get the errors below.  When a job attempts to run I see this under
> mapred.local.dir.
>
> taskTracker:
> total 4
> drwxr-s--- 3 cpenney mapred 4096 Sep 18 14:53 cpenney
>
> taskTracker/cpenney:
> total 4
> drwx--S--- 3 cpenney mapred 4096 Sep 18 14:53 jobcache
>
> taskTracker/cpenney/jobcache:
> total 4
> drwx--S--- 4 cpenney mapred 4096 Sep 18 14:53 job_201309181359_0029
>
> taskTracker/cpenney/jobcache/job_201309181359_0029:
> total 88
> drwx--S--- 3 cpenney mapred  4096 Sep 18 14:53 jars
> -rw------- 1 cpenney mapred 72974 Sep 18 14:53 job.xml
> -rw------- 1 cpenney mapred   230 Sep 18 14:53 jobToken
> drwx--S--- 2 cpenney mapred  4096 Sep 18 14:53 work
>
> taskTracker/cpenney/jobcache/job_201309181359_0029/jars:
> total 6672
> -rw------- 1 cpenney mapred   52780 Sep 18 14:53 .job.jar.crc
> -rwxrwxrwx 1 cpenney mapred 6754700 Sep 18 14:53 job.jar
> drwx--S--- 3 cpenney mapred    4096 Sep 18 14:53 org
>
> taskTracker/cpenney/jobcache/job_201309181359_0029/jars/org:
> total 4
> drwx--S--- 3 cpenney mapred 4096 Sep 18 14:53 apache
>
> taskTracker/cpenney/jobcache/job_201309181359_0029/jars/org/apache:
> total 4
> drwx--S--- 10 cpenney mapred 4096 Sep 18 14:53 pig
> [snipped]
>
> But in the log I see:
>
> 2013-09-18 14:53:59,951 WARN org.apache.hadoop.mapred.TaskTracker: Error
> initializing attempt_201309181359_0029_m_000002_0:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
> taskTracker/cpenney/jobcache/job_201309181359_0029/job.xml in any of the
> configured local directories
>
> 2013-09-18 14:53:59,952 ERROR org.apache.hadoop.mapred.TaskStatus: Trying
> to set finish time for task attempt_201309181359_0029_m_000002_0 when no
> start time is set, stackTrace is : java.lang.Exception
>
> 2013-09-18 14:54:00,303 WARN org.apache.hadoop.mapred.TaskTracker:
> Exception while localization java.io.IOException: Job initialization failed
> (255) with output: Reading task controller config from
> /etc/hadoop/taskcontroller.cfg
>
> My taskcontroller.cfg file has:
>
> mapred.local.dir=/tmp/hadoop/mapred
> hadoop.log.dir=/var/log/hadoop
> mapred.tasktracker.tasks.sleeptime-before-sigkill=30
> mapreduce.tasktracker.group=mapred
> banned.users=mapred,hdfs
> min.user.id=120
>
> In /etc/hadoop I have:
>
> ---Sr-s--- 1 root   mapred 63382 Nov 19  2012 task-controller
> -rw-r--r-- 1 root   mapred   196 Sep 18 14:30 taskcontroller.cfg
>
>
>    Chris
>
>
>
>
> On Wed, Sep 18, 2013 at 1:26 PM, Vinod Kumar Vavilapalli <
> vinodkv@apache.org> wrote:
>
>> What is your config set to for mapred local dirs? And what are the
>> permissions to those directories?
>>
>> All users need execute permission on every directory in the path up to
>> the local-dir so that they can create their own directories in there. For
>> example, if one of the mapred local dirs is /a/b/c/mapred, then all of /a,
>> /a/b, /a/b/c etc. need to be executable by everyone. Execute permission on
>> a Linux directory is what lets someone traverse it, and so create files or
>> directories in any of its sub-directories.
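
A minimal sketch of the check described above, assuming a POSIX system. It walks from a directory up to the filesystem root and lists every directory that lacks the execute bit for "other". The function name dirs_missing_world_execute is hypothetical, and the script only inspects mode bits; it does not account for ACLs or group membership.

```python
import os
import stat
import sys


def dirs_missing_world_execute(path):
    """Return `path` and its ancestors that lack o+x, ordered from root downwards."""
    missing = []
    current = os.path.abspath(path)
    while True:
        mode = os.stat(current).st_mode
        if stat.S_ISDIR(mode) and not (mode & stat.S_IXOTH):
            missing.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return list(reversed(missing))


if __name__ == "__main__":
    # Pass a local dir such as /tmp/hadoop/mapred; defaults to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    for directory in dirs_missing_world_execute(target):
        print("not executable by everyone:", directory)
```

An empty result means every directory on the path can be traversed by all users, at least as far as the plain mode bits are concerned.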
>>
>>  Thanks,
>> +Vinod Kumar Vavilapalli
>> Hortonworks Inc.
>> http://hortonworks.com/
>>
>> On Sep 18, 2013, at 7:26 AM, Christopher Penney wrote:
>>
>> I have a test environment with Hadoop 1.1.1 set up with Kerberos and
>> yesterday I zapped my mapred.local.dir on the job and task trackers as part
>> of some cleanup.  When I started the task trackers back up I was unable to
>> run MR jobs.  This seems like a permission issue, but I can't figure out
>> what it would be since it auto creates everything.  I didn't make any
>> changes to taskcontroller.cfg or mapred-site.xml.  Below is a log from the
>> task tracker.
>>
>>    Chris
>>
>> 2013-09-18 10:21:27,040 INFO org.apache.hadoop.mapred.TaskTracker:
>> LaunchTaskAction (registerTask): attempt_201309180916_0024_m_000002_0
>> task's state:UNASSIGNED
>> 2013-09-18 10:21:27,040 INFO org.apache.hadoop.mapred.TaskTracker: Trying
>> to launch : attempt_201309180916_0024_m_000002_0 which needs 1 slots
>> 2013-09-18 10:21:27,040 INFO org.apache.hadoop.mapred.TaskTracker: In
>> TaskLauncher, current free slots : 16 and trying to launch
>> attempt_201309180916_0024_m_000002_0 which needs 1 slots
>> 2013-09-18 10:21:28,524 WARN org.apache.hadoop.mapred.TaskTracker: Error
>> initializing attempt_201309180916_0024_m_000002_0:
>> org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find
>> taskTracker/cpenney/jobcache/job_201309180916_0024/job.xml in any of the
>> configured local directories
>>  at
>> org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:429)
>>  at
>> org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:160)
>>   at
>> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1341)
>>  at
>> org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
>>  at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
>>  at java.lang.Thread.run(Thread.java:662)
>>
>> 2013-09-18 10:21:28,525 ERROR org.apache.hadoop.mapred.TaskStatus: Trying
>> to set finish time for task attempt_201309180916_0024_m_000002_0 when no
>> start time is set, stackTrace is : java.lang.Exception
>>  at org.apache.hadoop.mapred.TaskStatus.setFinishTime(TaskStatus.java:145)
>>  at
>> org.apache.hadoop.mapred.TaskTracker$TaskInProgress.kill(TaskTracker.java:3285)
>>  at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2578)
>>  at java.lang.Thread.run(Thread.java:662)
>>
>> 2013-09-18 10:21:28,525 INFO org.apache.hadoop.mapred.TaskTracker:
>> addFreeSlot : current free slots : 16
>> 2013-09-18 10:21:28,554 INFO org.apache.hadoop.mapred.TaskTracker:
>> LaunchTaskAction (registerTask): attempt_201309180916_0024_m_000002_1
>> task's state:UNASSIGNED
>> 2013-09-18 10:21:28,554 INFO org.apache.hadoop.mapred.TaskTracker: Trying
>> to launch : attempt_201309180916_0024_m_000002_1 which needs 1 slots
>> 2013-09-18 10:21:28,554 INFO org.apache.hadoop.mapred.TaskTracker: In
>> TaskLauncher, current free slots : 16 and trying to launch
>> attempt_201309180916_0024_m_000002_1 which needs 1 slots
>> 2013-09-18 10:21:28,595 INFO org.apache.hadoop.mapred.TaskController:
>> Reading task controller config from /etc/hadoop/taskcontroller.cfg
>> 2013-09-18 10:21:28,595 INFO org.apache.hadoop.mapred.TaskController:
>> main : command provided 0
>> 2013-09-18 10:21:28,595 INFO org.apache.hadoop.mapred.TaskController:
>> main : user is cpenney
>> 2013-09-18 10:21:28,595 INFO org.apache.hadoop.mapred.TaskController:
>> Good mapred-local-dirs are /tmp/hadoop/mapred
>> 2013-09-18 10:21:28,595 INFO org.apache.hadoop.mapred.TaskController:
>> Can't open
>> /tmp/hadoop/mapred/taskTracker/cpenney/jobcache/job_201309180916_0024/jobToken
>> for output - File exists
>> 2013-09-18 10:21:28,596 WARN org.apache.hadoop.mapred.TaskTracker:
>> Exception while localization java.io.IOException: Job initialization failed
>> (255) with output: Reading task controller config from
>> /etc/hadoop/taskcontroller.cfg
>> main : command provided 0
>> main : user is cpenney
>> Good mapred-local-dirs are /tmp/hadoop/mapred
>> Can't open
>> /tmp/hadoop/mapred/taskTracker/cpenney/jobcache/job_201309180916_0024/jobToken
>> for output - File exists
>>
>>  at
>> org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:193)
>>  at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1323)
>>  at java.security.AccessController.doPrivileged(Native Method)
>>  at javax.security.auth.Subject.doAs(Subject.java:396)
>>  at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>>  at
>> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1298)
>>  at
>> org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
>>  at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
>>  at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.apache.hadoop.util.Shell$ExitCodeException:
>>  at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
>>  at org.apache.hadoop.util.Shell.run(Shell.java:182)
>>  at
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
>>  at
>> org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:186)
>>  ... 8 more
>>
>> 2013-09-18 10:21:28,596 ERROR
>> org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException
>> as:cpenney cause:java.io.IOException: Job initialization failed (255) with
>> output: Reading task controller config from /etc/hadoop/taskcontroller.cfg
>> main : command provided 0
>> main : user is cpenney
>> Good mapred-local-dirs are /tmp/hadoop/mapred
>> Can't open
>> /tmp/hadoop/mapred/taskTracker/cpenney/jobcache/job_201309180916_0024/jobToken
>> for output - File exists
>>
>> 2013-09-18 10:21:28,596 WARN org.apache.hadoop.mapred.TaskTracker: Error
>> initializing attempt_201309180916_0024_m_000002_1:
>> java.io.IOException: Job initialization failed (255) with output: Reading
>> task controller config from /etc/hadoop/taskcontroller.cfg
>> main : command provided 0
>> main : user is cpenney
>> Good mapred-local-dirs are /tmp/hadoop/mapred
>> Can't open
>> /tmp/hadoop/mapred/taskTracker/cpenney/jobcache/job_201309180916_0024/jobToken
>> for output - File exists
>>
>>  at
>> org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:193)
>>  at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1323)
>>  at java.security.AccessController.doPrivileged(Native Method)
>>  at javax.security.auth.Subject.doAs(Subject.java:396)
>>  at
>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
>>  at
>> org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1298)
>>  at
>> org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
>>  at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
>>  at java.lang.Thread.run(Thread.java:662)
>> Caused by: org.apache.hadoop.util.Shell$ExitCodeException:
>>  at org.apache.hadoop.util.Shell.runCommand(Shell.java:255)
>>  at org.apache.hadoop.util.Shell.run(Shell.java:182)
>>  at
>> org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:375)
>>  at
>> org.apache.hadoop.mapred.LinuxTaskController.initializeJob(LinuxTaskController.java:186)
>>  ... 8 more
>>
>
>
>
