Date: Thu, 3 May 2012 21:40:49 +0000 (UTC)
From: "Harsh J (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <2128227630.24023.1336081249324.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Created] (HDFS-3366) Some stacktraces are now too lengthy and sometimes no good

Harsh J created HDFS-3366:
-----------------------------

             Summary: Some stacktraces are now too lengthy and sometimes no good
                 Key: HDFS-3366
                 URL: https://issues.apache.org/jira/browse/HDFS-3366
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 2.0.0
            Reporter: Harsh J
            Priority: Minor


This is a high-on-nitpick ticket for the benefit of troubleshooting. It is partly related to all the PB changes we've had, and partly to Java/JVM behavior.

Take the case of an AccessControlException, which is pretty common in the HDFS permissions layer.
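For illustration (not part of the original report), here is a minimal sketch of the kind of client call that trips this check; the class name and the path are hypothetical, and the exception only appears when the calling user lacks WRITE access on the parent directory:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Hypothetical example class, not part of Hadoop.
public class MkdirsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // If the current user (e.g. "yarn") has no WRITE permission on "/",
    // the NameNode throws AccessControlException and the client rethrows
    // it, producing a trace like the one below.
    fs.mkdirs(new Path("/some-new-dir"));
  }
}
{code}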
We now get the following, due to several more calls added at the RPC layer for PB (or maybe something else, if I am mistaken):

{code}
Caused by: org.apache.hadoop.security.AccessControlException: org.apache.hadoop.security.AccessControlException: Permission denied: user=yarn, access=WRITE, inode="/":hdfs:supergroup:drwxr-xr-x
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:205)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:186)
	at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:135)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:4204)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:4175)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:2565)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:2529)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:640)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:412)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:42618)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:448)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:891)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1661)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1657)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1204)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1655)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:205)
	at $Proxy10.mkdirs(Unknown Source)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:84)
	at $Proxy10.mkdirs(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:430)
	at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:1717)
	... 9 more
{code}

The "9 more" is what I was looking for, to identify the caller to debug on / find the exact directory. However, it now gets eaten away because the mkdir-to-exception trace itself has grown quite a bit. Comparing this to 0.20, that release has far fewer calls in the trace, which helps us see at least the real caller of mkdirs.

I'm actually not sure what causes Java to print "... X more" in these forms of exception prints, but if that's controllable I am all in favor of increasing its amount for HDFS (using new default java opts?), so that when an exception does occur we don't get a nearly-unusable stacktrace.
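On the "... X more" question: the elision comes from Throwable's own stack trace printing, which collapses the trailing frames a cause shares with the enclosing exception's trace, and it does not appear to be tunable via JVM options; the full cause trace is still reachable programmatically. Below is a minimal, self-contained sketch (hypothetical class and method names, not HDFS code) of recovering the elided frames by printing the cause on its own:

{code}
// Hypothetical demo class illustrating the "... N more" elision.
public class CauseTraceExample {
  public static void main(String[] args) {
    try {
      outer();
    } catch (Exception e) {
      // The "Caused by:" section of this print ends in "... N more" because
      // its trailing frames are shared with the enclosing exception's trace.
      e.printStackTrace();

      Throwable cause = e.getCause();
      if (cause != null) {
        // Printed on its own there is no enclosing trace to share frames
        // with, so nothing is elided.
        cause.printStackTrace();
        // Or walk the frames programmatically, e.g. to log the full chain.
        for (StackTraceElement frame : cause.getStackTrace()) {
          System.err.println("\tat " + frame);
        }
      }
    }
  }

  static void outer() throws Exception {
    try {
      inner();
    } catch (RuntimeException e) {
      throw new Exception("wrapper", e);
    }
  }

  static void inner() {
    throw new RuntimeException("original failure");
  }
}
{code}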
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira