Date: Mon, 13 Jul 2015 23:27:05 +0000 (UTC)
From: "Zack Marsh (JIRA)"
To: dev@ambari.apache.org
Reply-To: dev@ambari.apache.org
Subject: [jira] [Commented] (AMBARI-12402) Permission denied errors for local usercache directories when attempting to run MapReduce job on Kerberos enabled cluster
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

    [ https://issues.apache.org/jira/browse/AMBARI-12402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625561#comment-14625561 ]

Zack Marsh commented on AMBARI-12402:
-------------------------------------

Changing the owner of these tdatuser usercache dirs, and of the dirs and files within them, seems to resolve the issue. Is this something Ambari should be doing during the Enable Kerberos Wizard? Is there a better work-around, perhaps removing these local usercache dirs altogether before enabling Kerberos?
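A minimal sketch of the chown work-around described above, assuming the `/dataN/hadoop/yarn/local` local-dir layout, the `hadoop` group, and the `tdatuser` name reported in this issue (adjust all three for your cluster; the function name `emit_usercache_chown` is made up for illustration). It only prints the chown commands so they can be reviewed before being run as root on each NodeManager host:

```shell
# Emit (dry-run) the chown commands that would hand a user's local
# usercache dirs back to that user; pipe the output to `sh` as root
# on each NodeManager host to actually apply them.
emit_usercache_chown() {
  user="$1"   # e.g. tdatuser
  base="$2"   # e.g. /data  (glob expands to /data1, /data2, ...)
  for d in "$base"*/hadoop/yarn/local/usercache/"$user"; do
    [ -d "$d" ] || continue
    echo "chown -R ${user}:hadoop $d"
  done
}

emit_usercache_chown tdatuser /data
```

The `hadoop` group matches the `ls -l` output later in this issue; stopping YARN before applying the commands avoids racing with container localization.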
> Permission denied errors for local usercache directories when attempting to run MapReduce job on Kerberos enabled cluster
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-12402
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12402
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 2.1.0
>        Environment: sles11sp3
>           Reporter: Zack Marsh
>           Priority: Critical
>
> Prior to enabling Kerberos on an HDP-2.3 cluster, I am able to run a simple MapReduce example as the Linux user 'tdatuser':
> {code}
> piripiri1:~ # su tdatuser
> tdatuser@piripiri1:/root> yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar pi 16 10000
> Number of Maps = 16
> Samples per Map = 10000
> Wrote input for Map #0
> Wrote input for Map #1
> Wrote input for Map #2
> Wrote input for Map #3
> Wrote input for Map #4
> Wrote input for Map #5
> Wrote input for Map #6
> Wrote input for Map #7
> Wrote input for Map #8
> Wrote input for Map #9
> Wrote input for Map #10
> Wrote input for Map #11
> Wrote input for Map #12
> Wrote input for Map #13
> Wrote input for Map #14
> Wrote input for Map #15
> Starting Job
> 15/07/13 17:02:31 INFO impl.TimelineClientImpl: Timeline service address: http:/ s/v1/timeline/
> 15/07/13 17:02:31 INFO client.ConfiguredRMFailoverProxyProvider: Failing over to
> 15/07/13 17:02:31 INFO input.FileInputFormat: Total input paths to process : 16
> 15/07/13 17:02:31 INFO mapreduce.JobSubmitter: number of splits:16
> 15/07/13 17:02:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_14
> 15/07/13 17:02:32 INFO impl.YarnClientImpl: Submitted application application_14
> 15/07/13 17:02:32 INFO mapreduce.Job: The url to track the job: http://piripiri3 cation_1436821014431_0003/
> 15/07/13 17:02:32 INFO mapreduce.Job: Running job: job_1436821014431_0003
> 15/07/13 17:05:50 INFO mapreduce.Job: Job job_1436821014431_0003 running in uber mode : false
> 15/07/13 17:05:50 INFO mapreduce.Job: map 0% reduce 0%
> 15/07/13 17:05:56 INFO mapreduce.Job: map 6% reduce 0%
> 15/07/13 17:06:00 INFO mapreduce.Job: map 13% reduce 0%
> 15/07/13 17:06:01 INFO mapreduce.Job: map 38% reduce 0%
> 15/07/13 17:06:05 INFO mapreduce.Job: map 44% reduce 0%
> 15/07/13 17:06:07 INFO mapreduce.Job: map 63% reduce 0%
> 15/07/13 17:06:09 INFO mapreduce.Job: map 69% reduce 0%
> 15/07/13 17:06:11 INFO mapreduce.Job: map 75% reduce 0%
> 15/07/13 17:06:12 INFO mapreduce.Job: map 81% reduce 0%
> 15/07/13 17:06:13 INFO mapreduce.Job: map 81% reduce 25%
> 15/07/13 17:06:14 INFO mapreduce.Job: map 94% reduce 25%
> 15/07/13 17:06:16 INFO mapreduce.Job: map 100% reduce 31%
> 15/07/13 17:06:17 INFO mapreduce.Job: map 100% reduce 100%
> 15/07/13 17:06:17 INFO mapreduce.Job: Job job_1436821014431_0003 completed successfully
> 15/07/13 17:06:17 INFO mapreduce.Job: Counters: 49
> File System Counters
> FILE: Number of bytes read=358
> FILE: Number of bytes written=2249017
> FILE: Number of read operations=0
> FILE: Number of large read operations=0
> FILE: Number of write operations=0
> HDFS: Number of bytes read=4198
> HDFS: Number of bytes written=215
> HDFS: Number of read operations=67
> HDFS: Number of large read operations=0
> HDFS: Number of write operations=3
> Job Counters
> Launched map tasks=16
> Launched reduce tasks=1
> Data-local map tasks=16
> Total time spent by all maps in occupied slots (ms)=160498
> Total time spent by all reduces in occupied slots (ms)=27302
> Total time spent by all map tasks (ms)=80249
> Total time spent by all reduce tasks (ms)=13651
> Total vcore-seconds taken by all map tasks=80249
> Total vcore-seconds taken by all reduce tasks=13651
> Total megabyte-seconds taken by all map tasks=246524928
> Total megabyte-seconds taken by all reduce tasks=41935872
> Map-Reduce Framework
> Map input records=16
> Map output records=32
> Map output bytes=288
> Map output materialized bytes=448
> Input split bytes=2310
> Combine input records=0
> Combine output records=0
> Reduce input groups=2
> Reduce shuffle bytes=448
> Reduce input records=32
> Reduce output records=0
> Spilled Records=64
> Shuffled Maps =16
> Failed Shuffles=0
> Merged Map outputs=16
> GC time elapsed (ms)=1501
> CPU time spent (ms)=13670
> Physical memory (bytes) snapshot=13480296448
> Virtual memory (bytes) snapshot=72598511616
> Total committed heap usage (bytes)=12508463104
> Shuffle Errors
> BAD_ID=0
> CONNECTION=0
> IO_ERROR=0
> WRONG_LENGTH=0
> WRONG_MAP=0
> WRONG_REDUCE=0
> File Input Format Counters
> Bytes Read=1888
> File Output Format Counters
> Bytes Written=97
> Job Finished in 226.813 seconds
> Estimated value of Pi is 3.14127500000000000000
> {code}
> However, after enabling Kerberos, the job fails:
> {code}
> tdatuser@piripiri1:/root> kinit -kt /etc/security/keytabs/tdatuser.headless.keytab tdatuser
> tdatuser@piripiri1:/root> yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar pi 16 10000
> Number of Maps = 16
> Samples per Map = 10000
> Wrote input for Map #0
> Wrote input for Map #1
> Wrote input for Map #2
> Wrote input for Map #3
> Wrote input for Map #4
> Wrote input for Map #5
> Wrote input for Map #6
> Wrote input for Map #7
> Wrote input for Map #8
> Wrote input for Map #9
> Wrote input for Map #10
> Wrote input for Map #11
> Wrote input for Map #12
> Wrote input for Map #13
> Wrote input for Map #14
> Wrote input for Map #15
> Starting Job
> 15/07/13 17:27:05 INFO impl.TimelineClientImpl: Timeline service address: http://piripiri1.labs.teradata.com:8188/ws/v1/timeline/
> 15/07/13 17:27:05 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 140 for tdatuser on ha-hdfs:PIRIPIRI
> 15/07/13 17:27:05 INFO security.TokenCache: Got dt for hdfs://PIRIPIRI; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:PIRIPIRI, Ident: (HDFS_DELEGATION_TOKEN token 140 for tdatuser)
> 15/07/13 17:27:06 INFO input.FileInputFormat: Total input paths to process : 16
> 15/07/13 17:27:06 INFO mapreduce.JobSubmitter: number of splits:16
> 15/07/13 17:27:06 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1436822321287_0007
> 15/07/13 17:27:06 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:PIRIPIRI, Ident: (HDFS_DELEGATION_TOKEN token 140 for tdatuser)
> 15/07/13 17:27:06 INFO impl.YarnClientImpl: Submitted application application_1436822321287_0007
> 15/07/13 17:27:06 INFO mapreduce.Job: The url to track the job: http://piripiri2.labs.teradata.com:8088/proxy/application_1436822321287_0007/
> 15/07/13 17:27:06 INFO mapreduce.Job: Running job: job_1436822321287_0007
> 15/07/13 17:27:09 INFO mapreduce.Job: Job job_1436822321287_0007 running in uber mode : false
> 15/07/13 17:27:09 INFO mapreduce.Job: map 0% reduce 0%
> 15/07/13 17:27:09 INFO mapreduce.Job: Job job_1436822321287_0007 failed with state FAILED due to: Application application_1436822321287_0007 failed 2 times due to AM Container for appattempt_1436822321287_0007_000002 exited with exitCode: -1000
> For more detailed output, check application tracking page:http://piripiri2.labs.teradata.com:8088/cluster/app/application_1436822321287_0007Then, click on links to logs of each attempt.
> Diagnostics: Application application_1436822321287_0007 initialization failed (exitCode=255) with output: main : command provided 0
> main : run as user is tdatuser
> main : requested yarn user is tdatuser
> Can't create directory /data1/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data2/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data3/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data4/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data5/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data6/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data7/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data8/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data9/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data10/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data11/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Can't create directory /data12/hadoop/yarn/local/usercache/tdatuser/appcache/application_1436822321287_0007 - Permission denied
> Did not create any app directories
> Failing this attempt. Failing the application.
> 15/07/13 17:27:09 INFO mapreduce.Job: Counters: 0
> Job Finished in 4.748 seconds
> java.io.FileNotFoundException: File does not exist: hdfs://PIRIPIRI/user/tdatuser/QuasiMonteCarlo_1436822823095_2120947622/out/reduce-out
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
> at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1752)
> at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1776)
> at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:314)
> at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:354)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
> at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:363)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
> at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
> at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:483)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
> {code}
> As seen above, there are many "Can't create directory ... Permission denied" errors related to the local usercache directory for the 'tdatuser' user.
> Prior to enabling Kerberos, the contents of a usercache directory were as follows:
> {code}
> piripiri4:~ # ls -l /data1/hadoop/yarn/local/usercache/
> total 0
> drwxr-xr-x 3 yarn hadoop 21 Jul 13 16:59 ambari-qa
> drwxr-x--- 4 yarn hadoop 37 Jul 13 17:00 tdatuser
> {code}
> After enabling Kerberos, the contents are:
> {code}
> piripiri4:~ # ls -l /data1/hadoop/yarn/local/usercache/
> total 0
> drwxr-s--- 4 ambari-qa hadoop 37 Jul 13 17:21 ambari-qa
> drwxr-x--- 4 yarn hadoop 37 Jul 13 17:00 tdatuser
> {code}
> It appears that the owner of the usercache directory for the 'ambari-qa' user was updated, but the 'tdatuser' directory was not.
> Is this expected behavior, and is there a recommended work-around for this issue?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
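The other work-around mentioned in the comment thread, deleting the stale local usercache dirs before enabling Kerberos so the NodeManager recreates them with the new ownership on the next container localization, could be sketched as follows. This is only an illustration under the `/dataN/hadoop/yarn/local` layout shown in this report (the function name `clean_usercache` is made up); run it as root on every NodeManager host while YARN is stopped:

```shell
# Remove all per-user usercache dirs under each YARN local dir so the
# NodeManager recreates them with correct ownership once Kerberos is
# enabled. Destructive: only run with YARN stopped.
clean_usercache() {
  base="$1"   # e.g. /data  (glob expands to /data1, /data2, ...)
  for d in "$base"*/hadoop/yarn/local/usercache/*; do
    [ -d "$d" ] || continue
    rm -rf "$d"
  done
}
```

Discarding localized job resources this way is safe in the sense that they are a cache; the NodeManager re-localizes whatever the next job needs.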