Date: Thu, 27 Sep 2012 18:30:09 +1100
From: "Eli Collins (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-8828) Support distcp from secure to insecure clusters

    [ https://issues.apache.org/jira/browse/HADOOP-8828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464523#comment-13464523 ]

Eli Collins commented on HADOOP-8828:
-------------------------------------

Btw, here's the current failure mode for a secure-to-insecure copy using hftp and webhdfs (thanks to Stephen Chu).

Caused by: java.io.IOException: Couldn't setup connection for schu@HAL.CLOUDERA.COM to hdfs/c1204.hal.cloudera.com@HAL.CLOUDERA.COM
    at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:540)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:512)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:596)
    at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:220)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1213)
    at org.apache.hadoop.ipc.Client.call(Client.java:1140)
    ... 22 more
Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)]
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:194)
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:137)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:423)
    at org.apache.hadoop.ipc.Client$Connection.access$1300(Client.java:220)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:589)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:586)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:585)
    ... 25 more
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7) - UNKNOWN_SERVER)
    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:663)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:230)
    at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:162)
    at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:175)
    ... 34 more
Caused by: KrbException: Server not found in Kerberos database (7) - UNKNOWN_SERVER
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:64)
    at sun.security.krb5.KrbTgsReq.getReply(KrbTgsReq.java:185)
    at sun.security.krb5.internal.CredentialsUtil.serviceCreds(CredentialsUtil.java:294)
    at sun.security.krb5.internal.CredentialsUtil.acquireServiceCreds(CredentialsUtil.java:106)
    at sun.security.krb5.Credentials.acquireServiceCreds(Credentials.java:557)
    at sun.security.jgss.krb5.Krb5Context.initSecContext(Krb5Context.java:594)
    ... 37 more
Caused by: KrbException: Identifier doesn't match expected value (906)
    at sun.security.krb5.internal.KDCRep.init(KDCRep.java:133)
    at sun.security.krb5.internal.TGSRep.init(TGSRep.java:58)
    at sun.security.krb5.internal.TGSRep.<init>(TGSRep.java:53)
    at sun.security.krb5.KrbTgsRep.<init>(KrbTgsRep.java:46)
    ... 42 more
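For reference, the failing invocation looks roughly like the following. The source NN hostname and paths are illustrative, 50070 and 8020 are the usual hftp and NN RPC default ports, c1204.hal.cloudera.com is the insecure destination NN from the trace, and webhdfs:// can be substituted for hftp:// as the source scheme:

    hadoop distcp hftp://secure-nn.hal.cloudera.com:50070/user/schu/data \
        hdfs://c1204.hal.cloudera.com:8020/user/schu/data
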
> Support distcp from secure to insecure clusters
> -----------------------------------------------
>
>                 Key: HADOOP-8828
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8828
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Eli Collins
>
> Users currently can't distcp from secure to insecure clusters.
>
> Relevant background from ATM:
>
> There's no plumbing to make the HFTP client use AuthenticatedURL when security is enabled. This means that even though you have the servlet filter correctly configured on the server, the client doesn't know how to properly authenticate to that filter.
>
> The crux of the issue is that security is enabled globally rather than per file system. The trick of using HFTP as the source FS works when the source is insecure, but not when the source is secure.
>
> A normal cp with two hdfs:// URLs can be made to work. There is indeed logic in o.a.h.ipc.Client to fall back to using simple authentication if your client config has security enabled (hadoop.security.authentication set to "kerberos") and the server responds with a response for simple authentication. The thing is, there are at least three bugs with this that I bumped into. All three can be worked around, as sketched after this list.
>
> 1) If your client config has security enabled you *must* have a valid Kerberos TGT, even if you're interacting with an insecure cluster. The Hadoop client unfortunately tries to read the local ticket cache before it tries to connect to the server, and so doesn't know that it won't need Kerberos credentials.
>
> 2) Even though the destination NN is insecure, it has to have a Kerberos principal created for it. You don't need a keytab, and you don't need to change any settings on the destination NN; the principal just needs to exist in the principal database. This is again because the Hadoop client will, before connecting to the remote NN, try to get a service ticket for the hdfs/f.q.d.n principal of the remote NN. If this fails, it never gets to the part where it connects to the insecure NN and falls back to simple auth.
>
> 3) Once you get through problems 1 and 2, you will try to connect to the remote, insecure NN. This will work, but the reported principal name of your user will include a realm that the remote NN doesn't know about. You will either need to change the default_realm setting in /etc/krb5.conf on the insecure NN to match the secure NN's, or add some custom hadoop.security.auth_to_local mappings on the insecure NN so it knows how to translate the long principal name into a short name.
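>
> A minimal sketch of those three workarounds, using the principal and hostnames from the stack trace above (kadmin access and exact file locations will vary by site):
>
>     # 1) Get a TGT on the client, even though the destination is insecure:
>     kinit schu@HAL.CLOUDERA.COM
>
>     # 2) Create a principal for the insecure NN in the secure realm's KDC
>     #    (no keytab needed, and nothing changes on the destination NN):
>     kadmin -q "addprinc -randkey hdfs/c1204.hal.cloudera.com@HAL.CLOUDERA.COM"
>
>     # 3) On the insecure NN, either set default_realm = HAL.CLOUDERA.COM in
>     #    /etc/krb5.conf, or add a realm-stripping rule to core-site.xml:
>     <property>
>       <name>hadoop.security.auth_to_local</name>
>       <value>RULE:[1:$1@$0](.*@HAL\.CLOUDERA\.COM)s/@.*//
>              DEFAULT</value>
>     </property>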
>
> Even with all these changes, distcp still won't work, since the first thing it does when submitting the job is to get a delegation token for each of the involved NNs, and that fails because the insecure NN isn't running a delegation token (DT) secret manager. I haven't been able to figure out a way around this, except to make a custom distcp that doesn't insist on doing it.
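>
> For contrast, once the three workarounds are in place, the plain-copy path mentioned above does go through, e.g. (the source NN hostname is illustrative):
>
>     hadoop fs -cp hdfs://secure-nn.hal.cloudera.com:8020/user/schu/file \
>         hdfs://c1204.hal.cloudera.com:8020/user/schu/file
>
> It's only distcp's up-front delegation token fetch that has no workaround.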