hadoop-common-user mailing list archives

From Emma Lin <l...@vmware.com>
Subject Issues during setting up hadoop security cluster
Date Fri, 20 Jan 2012 04:52:14 GMT
Gurus,
I'm setting up a secure Hadoop 0.23 cluster, but communication between the Data Node and
Name Node, and between the Node Manager and Resource Manager, is failing.
When I start the Node Manager, it reports the following error and then shuts itself down.
Have you seen this issue before? Do you have any ideas on how to triage it?

2012-01-20 12:03:08,258 INFO  ipc.HadoopYarnRPC (HadoopYarnProtoRPC.java:getProxy(48)) - Creating
a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.server.api.ResourceTracker
2012-01-20 12:03:08,291 INFO  nodemanager.NodeStatusUpdaterImpl (NodeStatusUpdaterImpl.java:registerWithRM(155))
- Connected to ResourceManager at hadoopRM.example.aurora:9003
2012-01-20 12:03:20,399 WARN  ipc.Client (Client.java:run(526)) - Couldn't setup connection
for nm/hadoopNM.example.aurora@EXAMPLE.AURORA to rm/hadoopRM.example.aurora@EXAMPLE.AURORA
2012-01-20 12:03:20,405 ERROR service.CompositeService (CompositeService.java:start(72)) -
Error starting services org.apache.hadoop.yarn.server.nodemanager.NodeManager
org.apache.avro.AvroRuntimeException: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:132)
        at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:163)
        at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:231)
Caused by: java.lang.reflect.UndeclaredThrowableException
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:161)
        at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:128)
        ... 3 more
Caused by: com.google.protobuf.ServiceException: java.io.IOException: Failed on local exception:
java.io.IOException: Couldn't setup connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA
to rm/hadoopRM.example.aurora@EXAMPLE.AURORA; Host Details : local host is: "hadoopNM/10.112.127.102";
destination host is: ""hadoopRM.example.aurora":9003;
        at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
        at $Proxy14.registerNodeManager(Unknown Source)
        at org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
        ... 5 more
Caused by: java.io.IOException: Failed on local exception: java.io.IOException: Couldn't setup
connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA to rm/hadoopRM.example.aurora@EXAMPLE.AURORA;
Host Details : local host is: "hadoopNM/10.112.127.102"; destination host is: ""hadoopRM.example.aurora":9003;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:655)
        at org.apache.hadoop.ipc.Client.call(Client.java:1089)
        at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
        ... 7 more
Caused by: java.io.IOException: Couldn't setup connection for nm/hadoopNM.example.aurora@EXAMPLE.AURORA
to rm/hadoopRM.example.aurora@EXAMPLE.AURORA
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:527)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:499)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
        at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
        at org.apache.hadoop.ipc.Client.call(Client.java:1065)
        ... 8 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
No valid credentials provided (Mechanism level: Server not found in Kerberos database (7)
- UNKNOWN_SERVER)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:137)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:407)
        at org.apache.hadoop.ipc.Client$Connection.access$1200(Client.java:205)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:576)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:573)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1152)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:572)
        ... 11 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found
in Kerberos database (7) - UNKNOWN_SERVER)
        at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:663)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:230)
        at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
        ... 20 more
Caused by: KrbException: Server not found in Kerberos database (7) - UNKNOWN_SERVER
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:64)
        at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:185)
        at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:294)
        at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:106)
        at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:557)
        at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:594)
        ... 23 more
Caused by: KrbException: Identifier doesn't match expected value (906)
        at sun.security.krb5.internal.KDCRep.init(KDCRep.java:133)
        at sun.security.krb5.internal.TGSRep.init(TGSRep.java:58)
        at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:53)
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:46)
        ... 28 more
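
When triaging a nested trace like the one above, it can help to pull out only the "Caused by:" messages and read the innermost one first. The following is a minimal Python sketch, not part of Hadoop. The helper name `cause_chain` and the abbreviated `SAMPLE_TRACE` are hypothetical, and it assumes that stack frames are indented while wrapped message lines are not, as in this log.

```python
# Abbreviated excerpt of the trace above (frames trimmed for brevity).
SAMPLE_TRACE = """\
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException:
No valid credentials provided (Mechanism level: Server not found in Kerberos database (7)
- UNKNOWN_SERVER)]
        at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
        ... 11 more
Caused by: KrbException: Server not found in Kerberos database (7) - UNKNOWN_SERVER
        at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:64)
        ... 23 more
Caused by: KrbException: Identifier doesn't match expected value (906)
        at sun.security.krb5.internal.KDCRep.init(KDCRep.java:133)
        ... 28 more
"""


def cause_chain(trace_text):
    """Return the 'Caused by:' messages in order, outermost first.

    Assumption: frame lines ("at ...", "... N more") start with whitespace,
    while a message wrapped by the mail client continues at column zero.
    """
    causes = []
    collecting = False
    for line in trace_text.splitlines():
        if line.startswith("Caused by: "):
            # Start of a new cause; keep the text after the prefix.
            causes.append(line[len("Caused by: "):].strip())
            collecting = True
        elif collecting and line and not line[0].isspace():
            # Unindented continuation of a wrapped message.
            causes[-1] += " " + line.strip()
        else:
            # Indented frame line or blank line ends the current message.
            collecting = False
    return causes


if __name__ == "__main__":
    for depth, message in enumerate(cause_chain(SAMPLE_TRACE), start=1):
        print(depth, message)
```

The last entry in the returned list is the innermost cause. In this excerpt that is the `KrbException` raised while the client was reading the reply to its service ticket request.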

The error says that no valid server credentials were provided, but I have already added those
credentials on the Resource Manager node. The keytab listing is as follows:
line@hadoopRM:~$ klist -k -e -t /etc/krb5.keytab
Keytab name: WRFILE:/etc/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (arcfour-hmac)
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (des3-cbc-sha1)
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (des-cbc-crc)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (arcfour-hmac)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (des3-cbc-sha1)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (des-cbc-crc)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (arcfour-hmac)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (des3-cbc-sha1)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (des-cbc-crc)
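
One quick sanity check is to compare the service principal named in the error against the keytab listing, character for character. The following is a minimal Python sketch under stated assumptions: the helper names `keytab_principals` and `has_principal` are hypothetical, and `KLIST_OUTPUT` is an abbreviated copy of the listing above. It only inspects the text printed by `klist -k`. It does not query the KDC, which is where the "Server not found in Kerberos database" reply originates, so a match here does not by itself rule out a missing or differently spelled principal on the KDC side.

```python
import re

# Abbreviated copy of the `klist -k -e -t` output above (one enctype per principal).
KLIST_OUTPUT = """\
Keytab name: WRFILE:/etc/krb5.keytab
KVNO Timestamp         Principal
---- ----------------- --------------------------------------------------------
   2 01/20/12 10:55:02 rm/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/19/12 11:19:11 host/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
   2 01/19/12 11:20:15 jhs/hadoopRM.example.aurora@EXAMPLE.AURORA (aes256-cts-hmac-sha1-96)
"""

# One keytab row: KVNO, date, time, principal, optional "(enctype)".
ROW = re.compile(r"^\s*(\d+)\s+(\S+)\s+(\S+)\s+(\S+@\S+)(?:\s+\((\S+)\))?\s*$")


def keytab_principals(klist_text):
    """Return the set of principal names found in `klist -k` style output."""
    principals = set()
    for line in klist_text.splitlines():
        match = ROW.match(line)
        if match:  # header and separator lines do not match
            principals.add(match.group(4))
    return principals


def has_principal(klist_text, wanted):
    """Exact, case-sensitive membership test for a principal name."""
    return wanted in keytab_principals(klist_text)


if __name__ == "__main__":
    # Principal taken from the "Couldn't setup connection ... to ..." message.
    target = "rm/hadoopRM.example.aurora@EXAMPLE.AURORA"
    print(target, "in keytab:", has_principal(KLIST_OUTPUT, target))
```

The comparison is deliberately exact because differences in case or host name spelling between the requested name and the stored name are easy to miss by eye.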

The full Node Manager log is attached.

Any ideas are appreciated.
Thanks
Emma
