hadoop-mapreduce-user mailing list archives

From "Raviteja Chirala" <rte...@gmail.com>
Subject Re: Client mapred tries to renew a token with renewer specified as nobody
Date Tue, 03 Dec 2013 19:24:50 GMT
Alternatively, you can schedule a cron job to run kinit every 20 hours or so, just to renew
the token before it expires.
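
A minimal sketch of what such a cron job might run, assuming a keytab is available so that kinit needs no password. The keytab path, the principal, and the function names are all hypothetical placeholders, not anything from the cluster discussed here.

```python
import subprocess

# Hypothetical values: replace with your own keytab path and principal.
KEYTAB = "/etc/security/keytabs/user.keytab"
PRINCIPAL = "user@EXAMPLE.COM"


def build_kinit_command(keytab, principal):
    """Return the argv for a non-interactive kinit using a keytab."""
    return ["kinit", "-k", "-t", keytab, principal]


def refresh_ticket(keytab=KEYTAB, principal=PRINCIPAL, run=subprocess.run):
    """Run kinit once; return True if it exited with status 0."""
    result = run(build_kinit_command(keytab, principal))
    return result.returncode == 0
```

Cron would call refresh_ticket() on whatever schedule fits inside the 24-hour window. Note that kinit only refreshes the Kerberos ticket cache; it does not by itself touch an HDFS delegation token that was already issued to a running job.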


On Mon, Dec 2, 2013 at 9:12 AM, Rainer Toebbicke <rtb@pclella.cern.ch> wrote:

> Hello,
> I am trying to understand why my long-running MapReduce jobs stop after approximately 24 hours on a secure cluster.
> This is on Cloudera CDH 4.3.0, hence Hadoop 2.0.0, using MRv1 (not YARN), with authentication specified as "kerberos". Trying with a short-lived Kerberos ticket (1h), I see that it gets renewed regularly. Still, the job crashes after 24 hours because the delegation token expires.
> On a test cluster with increased logging and a shortened dfs.namenode.delegation.token.renew-interval (for quicker debugging), I see that an immediate renewal of the delegation token fails, and then after the expiry period the Namenode log starts getting clobbered.
> Detail:
> 2013-12-02 15:57:08,461 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for tobbicke@CERN.CH (auth:TOKEN)
> 2013-12-02 15:57:08,462 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for tobbicke@CERN.CH (auth:TOKEN) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
> 2013-12-02 15:57:08,500 INFO SecurityLogger.org.apache.hadoop.ipc.Server: Auth successful for mapred/xxx.cern.ch@CERN.CH (auth:SIMPLE)
> 2013-12-02 15:57:08,540 INFO SecurityLogger.org.apache.hadoop.security.authorize.ServiceAuthorizationManager: Authorization successful for mapred/xxx.cern.ch@CERN.CH (auth:KERBEROS) for protocol=interface org.apache.hadoop.hdfs.protocol.ClientProtocol
> 2013-12-02 15:57:08,541 INFO org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager: Token renewal requested for identifier: HDFS_DELEGATION_TOKEN token 12 for tobbicke
> 2013-12-02 15:57:08,541 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:mapred/xxx.cern.ch@CERN.CH (auth:KERBEROS) cause:org.apache.hadoop.security.AccessControlException: Client mapred tries to renew a token with renewer specified as nobody
> 2013-12-02 15:57:08,541 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.renewDelegationToken from 188.184.xxx.xxx:42031: error: org.apache.hadoop.security.AccessControlException: Client mapred tries to renew a token with renewer specified as nobody
> org.apache.hadoop.security.AccessControlException: Client mapred tries to renew a token with renewer specified as nobody
>         at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.renewToken(AbstractDelegationTokenSecretManager.java:274)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renewDelegationToken(FSNamesystem.java:5319)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.renewDelegationToken(NameNodeRpcServer.java:377)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.renewDelegationToken(ClientNamenodeProtocolServerSideTranslatorPB.java:814)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:45024)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1002)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1701)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1697)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1695)
> Is this as unhealthy as it looks? If the first (immediate) renewal fails, I assume the others will share the same fate. Would that explain the 24-hour lifetime on the "real" cluster, and what could be the reason? How does "nobody" come into the game here?
> In any case, linked to this or not, after dfs.namenode.delegation.token.renew-interval ms the following is logged a zillion times:
> 2013-12-02 16:58:09,718 WARN SecurityLogger.org.apache.hadoop.ipc.Server: Auth failed for 188.184.xxx.xxx:44979:null (DIGEST-MD5: IO error acquiring password)
> 2013-12-02 16:58:09,719 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 9000: readAndProcess threw exception javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: token (HDFS_DELEGATION_TOKEN token 12 for tobbicke) is expired] from client 188.184.xxx.xxx. Count of bytes read: 0
> javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password [Caused by org.apache.hadoop.security.token.SecretManager$InvalidToken: token (HDFS_DELEGATION_TOKEN token 12 for tobbicke) is expired]
>         at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:577)
>         at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(DigestMD5Server.java:226)
>         at org.apache.hadoop.ipc.Server$Connection.saslReadAndProcess(Server.java:1210)
>         at org.apache.hadoop.ipc.Server$Connection.readAndProcess(Server.java:1405)
>         at org.apache.hadoop.ipc.Server$Listener.doRead(Server.java:719)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.doRunLoop(Server.java:518)
>         at org.apache.hadoop.ipc.Server$Listener$Reader.run(Server.java:493)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: token (HDFS_DELEGATION_TOKEN token 12 for tobbicke) is expired
>         at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.retrievePassword(AbstractDelegationTokenSecretManager.java:227)
>         at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.retrievePassword(AbstractDelegationTokenSecretManager.java:46)
>         at org.apache.hadoop.security.SaslRpcServer$SaslDigestCallbackHandler.getPassword(SaslRpcServer.java:194)
>         at org.apache.hadoop.security.SaslRpcServer$SaslDigestCallbackHandler.handle(SaslRpcServer.java:220)
>         at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(DigestMD5Server.java:568)
>         ... 6 more
> Any ideas?
> Rainer