Date: Mon, 13 Jul 2015 21:48:04 +0000 (UTC)
From: "Zack Marsh (JIRA)"
To: yarn-dev@hadoop.apache.org
Subject: [jira] [Created] (YARN-3921) Permission denied errors for local usercache directories when attempting to run MapReduce job on Kerberos enabled cluster

Zack Marsh created YARN-3921:
--------------------------------

             Summary: Permission denied errors for local usercache directories when attempting to run MapReduce job on Kerberos enabled cluster
                 Key: YARN-3921
                 URL: https://issues.apache.org/jira/browse/YARN-3921
             Project: Hadoop YARN
          Issue Type: Bug
    Affects Versions: 2.7.1
         Environment: sles11sp3
            Reporter: Zack Marsh

Prior to enabling Kerberos on the Hadoop cluster, I am able to run a simple MapReduce example as the Linux user 'tdatuser':

{code}
piripiri1:~ # su tdatuser
tdatuser@piripiri1:/root> yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar pi 16 10000
Number of Maps = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
15/07/13 17:02:31 INFO impl.TimelineClientImpl: Timeline service address: http:/ s/v1/timeline/
15/07/13 17:02:31 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to
15/07/13 17:02:31 INFO input.FileInputFormat: Total input paths to process : 16
15/07/13 17:02:31 INFO mapreduce.JobSubmitter: number of splits:16
15/07/13 17:02:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_14
15/07/13 17:02:32 INFO impl.YarnClientImpl: Submitted application application_14
15/07/13 17:02:32 INFO mapreduce.Job: The url to track the job: http://piripiri3 cation_1436821014431_0003/
15/07/13 17:02:32 INFO mapreduce.Job: Running job: job_1436821014431_0003
15/07/13 17:05:50 INFO mapreduce.Job: Job job_1436821014431_0003 running in uber mode : false
15/07/13 17:05:50 INFO mapreduce.Job: map 0% reduce 0%
15/07/13 17:05:56 INFO mapreduce.Job: map 6% reduce 0%
15/07/13 17:06:00 INFO mapreduce.Job: map 13% reduce 0%
15/07/13 17:06:01 INFO mapreduce.Job: map 38% reduce 0%
15/07/13 17:06:05 INFO mapreduce.Job: map 44% reduce 0%
15/07/13 17:06:07 INFO mapreduce.Job: map 63% reduce 0%
15/07/13 17:06:09 INFO mapreduce.Job: map 69% reduce 0%
15/07/13 17:06:11 INFO mapreduce.Job: map 75% reduce 0%
15/07/13 17:06:12 INFO mapreduce.Job: map 81% reduce 0%
15/07/13 17:06:13 INFO mapreduce.Job: map 81% reduce 25%
15/07/13 17:06:14 INFO mapreduce.Job: map 94% reduce 25%
15/07/13 17:06:16 INFO mapreduce.Job: map 100% reduce 31%
15/07/13 17:06:17 INFO mapreduce.Job: map 100% reduce 100%
15/07/13 17:06:17 INFO mapreduce.Job: Job job_1436821014431_0003 completed successfully
15/07/13 17:06:17 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=358
                FILE: Number of bytes written=2249017
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4198
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=67
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=16
                Launched reduce tasks=1
                Data-local map tasks=16
                Total time spent by all maps in occupied slots (ms)=160498
                Total time spent by all reduces in occupied slots (ms)=27302
                Total time spent by all map tasks (ms)=80249
                Total time spent by all reduce tasks (ms)=13651
                Total vcore-seconds taken by all map tasks=80249
                Total vcore-seconds taken by all reduce tasks=13651
                Total megabyte-seconds taken by all map tasks=246524928
                Total megabyte-seconds taken by all reduce tasks=41935872
        Map-Reduce Framework
                Map input records=16
                Map output records=32
                Map output bytes=288
                Map output materialized bytes=448
                Input split bytes=2310
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=448
                Reduce input records=32
                Reduce output records=0
                Spilled Records=64
                Shuffled Maps =16
                Failed Shuffles=0
                Merged Map outputs=16
                GC time elapsed (ms)=1501
                CPU time spent (ms)=13670
                Physical memory (bytes) snapshot=13480296448
                Virtual memory (bytes) snapshot=72598511616
                Total committed heap usage (bytes)=12508463104
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1888
        File Output Format Counters
                Bytes Written=97
Job Finished in 226.813 seconds
Estimated value of Pi is 3.14127500000000000000
{code}

However, after enabling Kerberos, the job fails:

{code}
tdatuser@piripiri1:/root> kinit -kt /etc/security/keytabs/tdatuser.headless.keytab tdatuser
tdatuser@piripiri1:/root> yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar pi 16 10000
Number of Maps = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15
Starting Job
15/07/13 17:27:05 INFO impl.TimelineClientImpl: Timeline service address: http://piripiri1.labs.teradata.com:8188/ws/v1/timeline/
15/07/13 17:27:05 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 140 for tdatuser on ha-hdfs:PIRIPIRI
15/07/13 17:27:05 INFO security.TokenCache: Got dt for hdfs://PIRIPIRI; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:PIRIPIRI, Ident: (HDFS_DELEGATION_TOKEN token 140 for tdatuser)
15/07/13 17:27:06 INFO input.FileInputFormat: Total input paths to process : 16
15/07/13 17:27:06 INFO mapreduce.JobSubmitter: number of splits:16
15/07/13 17:27:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1436822321287_0007
15/07/13 17:27:06 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:PIRIPIRI, Ident: (HDFS_DELEGATION_TOKEN token 140 for tdatuser)
15/07/13 17:27:06 INFO impl.YarnClientImpl: Submitted application application_1436822321287_0007
15/07/13 17:27:06 INFO mapreduce.Job: The url to track the job: http://piripiri2.labs.teradata.com:8088/proxy/application_1436822321287_0007/
15/07/13 17:27:06 INFO mapreduce.Job: Running job: job_1436822321287_0007
15/07/13 17:27:09 INFO mapreduce.Job: Job job_1436822321287_0007 running in uber mode : false
15/07/13 17:27:09 INFO mapreduce.Job: map 0% reduce 0%
15/07/13 17:27:09 INFO mapreduce.Job: Job job_1436822321287_0007 failed with state FAILED due to: Application application_1436822321287_0007 failed 2 times due to AM Container for appattempt_1436822321287_0007_000002 exited with exitCode: -1000
For more detailed output, check application tracking page:http://piripiri2.labs.teradata.com:8088/cluster/app/application_1436822321287_0007 Then, click on links to logs of each attempt.
Diagnostics: Application application_1436822321287_0007 initialization failed (exitCode=255) with output: main : command provided 0
main : run as user is tdatuser
main : requested yarn user is tdatuser
Can't create directory /data1/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data2/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data3/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data4/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data5/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data6/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data7/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data8/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data9/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data10/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data11/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Can't create directory /data12/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
Did not create any app directories
Failing this attempt. Failing the application.
15/07/13 17:27:09 INFO mapreduce.Job: Counters: 0
Job Finished in 4.748 seconds
java.io.FileNotFoundException: File does not exist: hdfs://PIRIPIRI/user/tdatuser/QuasiMonteCarlo_1436822823095_2120947622/out/reduce-out
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1752)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1776)
        at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
        at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
        at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:483)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{code}

As seen above, there are many "Can't create directory ... Permission denied" errors related to the local usercache directory for 'tdatuser'. Prior to enabling Kerberos, the contents of a usercache directory were as follows:

{code}
piripiri4:~ # ls -l /data1/hadoop/yarn/local/usercache/
total 0
drwxr-xr-x 3 yarn hadoop 21 Jul 13 16:59 ambari-qa
drwxr-x--- 4 yarn hadoop 37 Jul 13 17:00 tdatuser
{code}

After enabling Kerberos, the contents are:

{code}
piripiri4:~ # ls -l /data1/hadoop/yarn/local/usercache/
total 0
drwxr-s--- 4 ambari-qa hadoop 37 Jul 13 17:21 ambari-qa
drwxr-x--- 4 yarn hadoop 37 Jul 13 17:00 tdatuser
{code}

It appears that the owner of the usercache directory for the 'ambari-qa' user was updated, but the 'tdatuser' directory was not.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
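The ownership mismatch in the {{ls -l}} output above can be spotted mechanically. Below is a minimal sketch, not part of the report: the {{check_usercache}} function and the expectation that each {{usercache/<user>}} directory is owned by {{<user>}} are assumptions based on how the LinuxContainerExecutor (required in secure mode) localizes files as the submitting user.

```shell
#!/bin/sh
# Sketch (assumption, not from the report): flag usercache subdirectories
# whose owner does not match the directory name. Directories still owned by
# 'yarn' from a pre-Kerberos setup would produce exactly the kind of
# "Can't create directory ... - Permission denied" failures shown above.
check_usercache() {
  base="$1"
  status=0
  for dir in "$base"/*/; do
    [ -d "$dir" ] || continue
    user=$(basename "$dir")
    owner=$(stat -c '%U' "$dir")   # GNU stat, as on SLES
    if [ "$owner" != "$user" ]; then
      echo "MISMATCH: $dir owned by $owner, expected $user"
      status=1
    fi
  done
  return $status
}
```

A commonly cited cleanup in this situation (again an assumption, not something this report confirms) is to remove the stale {{usercache/<user>}} directories under each configured {{yarn.nodemanager.local-dirs}} path on every NodeManager, so that the container-executor recreates them with the correct owner on the next job submission.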