hadoop-hdfs-dev mailing list archives

From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11908) libhdfs++: Authentication failure when first NN of kerberized HA cluster is standby
Date Wed, 31 May 2017 21:01:05 GMT
James Clampffer created HDFS-11908:

             Summary: libhdfs++: Authentication failure when first NN of kerberized HA cluster is standby
                 Key: HDFS-11908
                 URL: https://issues.apache.org/jira/browse/HDFS-11908
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: James Clampffer
            Assignee: James Clampffer

The library fails to authenticate to a kerberized HA cluster if the first namenode it tries
to connect to is the standby.  The RpcConnection ends up attempting to use simple auth.

Control flow to connect to NN for the first time:
# RpcConnection constructed with a pointer to the RpcEngine as the only argument
# RpcConnection::Connect(server endpoints, auth_info, callback) called
** auth_info contains the SASL mechanism to use, plus the delegation token if we already have one

Control flow to connect to NN after failover:
# RpcEngine::NewConnection called, allocates an RpcConnection exactly as step 1 above does
# RpcEngine::InitializeConnection called, sets event hooks and a string for the cluster name
# RpcConnection::PreEnqueueRequests called to re-add the RPC messages that didn't make it on the last call due to the standby exception
# RpcConnection::ConnectAndFlush called to send RPC packets. This only takes server endpoints, no auth info, so auth_info_ is never set on the new connection
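The difference between the two paths can be shown with a toy model. This is only a sketch: the classes below are simplified stand-ins that borrow the names used above, not the real libhdfs++ declarations. AuthMethod, used_method(), the two helper functions, and the example hostnames are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for illustration; NOT the real libhdfs++ types.
enum class AuthMethod { kSimple, kKerberos };

struct AuthInfo {
  AuthMethod method = AuthMethod::kSimple;  // default-constructed means simple auth
};

class RpcEngine;  // forward declaration; RpcConnection only stores a pointer

class RpcConnection {
 public:
  // Mirrors step 1: the engine pointer is the only constructor argument.
  explicit RpcConnection(RpcEngine* engine) : engine_(engine) {}

  // First-connect path: auth info is handed over explicitly.
  void Connect(const std::vector<std::string>& endpoints, const AuthInfo& auth_info) {
    auth_info_ = auth_info;
    ConnectAndFlush(endpoints);
  }

  // Failover path: only endpoints are supplied, so auth_info_ keeps
  // whatever value it already had (here, the simple-auth default).
  void ConnectAndFlush(const std::vector<std::string>& endpoints) {
    assert(!endpoints.empty());
    used_method_ = auth_info_.method;
  }

  AuthMethod used_method() const { return used_method_; }
  RpcEngine* engine() const { return engine_; }

 private:
  RpcEngine* engine_;
  AuthInfo auth_info_;
  AuthMethod used_method_ = AuthMethod::kSimple;
};

class RpcEngine {
 public:
  explicit RpcEngine(const AuthInfo& auth_info) : auth_info_(auth_info) {}
  const AuthInfo& auth_info() const { return auth_info_; }
  RpcConnection NewConnection() { return RpcConnection(this); }

 private:
  AuthInfo auth_info_;
};

// Hypothetical helper: first connection, auth info passed to Connect.
AuthMethod MethodOnFirstConnect() {
  AuthInfo kerberos;
  kerberos.method = AuthMethod::kKerberos;
  RpcEngine engine(kerberos);
  RpcConnection conn = engine.NewConnection();
  conn.Connect({"nn1.example.com:8020"}, engine.auth_info());
  return conn.used_method();
}

// Hypothetical helper: reconnect after failover, endpoints only.
AuthMethod MethodAfterFailover() {
  AuthInfo kerberos;
  kerberos.method = AuthMethod::kKerberos;
  RpcEngine engine(kerberos);
  RpcConnection conn = engine.NewConnection();
  conn.ConnectAndFlush({"nn2.example.com:8020"});
  return conn.used_method();
}
```

In this model the engine is configured for Kerberos in both cases, but only the first-connect path carries that setting into the connection. The failover path silently falls back to the default.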

To fix:
RpcEngine::InitializeConnection just needs to set RpcConnection::auth_info_ from the existing
RpcEngine::auth_info_.  Even better would be setting this in the constructor, so that if an
RpcConnection exists it can be expected to be in a usable state.  I'll get a diff up once I
sort out CI build failures.
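A rough sketch of the constructor-based variant, using simplified stand-in classes rather than the real libhdfs++ declarations, might look like the following. AuthMethod, used_method(), the helper function, and the example hostname are hypothetical. The actual patch may differ.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-ins for illustration; NOT the real libhdfs++ types.
enum class AuthMethod { kSimple, kKerberos };

struct AuthInfo {
  AuthMethod method = AuthMethod::kSimple;
};

class RpcConnection;  // forward declaration for NewConnection's return type

class RpcEngine {
 public:
  explicit RpcEngine(const AuthInfo& auth_info) : auth_info_(auth_info) {}
  const AuthInfo& auth_info() const { return auth_info_; }
  RpcConnection NewConnection();  // defined once RpcConnection is complete

 private:
  AuthInfo auth_info_;
};

class RpcConnection {
 public:
  // The constructor copies the engine's auth info, so any RpcConnection
  // that exists already knows how to authenticate.
  explicit RpcConnection(RpcEngine* engine)
      : engine_(engine), auth_info_(engine->auth_info()) {}

  // Failover path is unchanged: endpoints only.
  void ConnectAndFlush(const std::vector<std::string>& endpoints) {
    assert(!endpoints.empty());
    used_method_ = auth_info_.method;
  }

  AuthMethod used_method() const { return used_method_; }
  RpcEngine* engine() const { return engine_; }

 private:
  RpcEngine* engine_;
  AuthInfo auth_info_;
  AuthMethod used_method_ = AuthMethod::kSimple;
};

inline RpcConnection RpcEngine::NewConnection() { return RpcConnection(this); }

// Hypothetical helper: reconnect after failover with the fix in place.
AuthMethod MethodAfterFailoverFixed() {
  AuthInfo kerberos;
  kerberos.method = AuthMethod::kKerberos;
  RpcEngine engine(kerberos);
  RpcConnection conn = engine.NewConnection();
  conn.ConnectAndFlush({"nn2.example.com:8020"});
  return conn.used_method();
}
```

Copying in the constructor means no caller has to remember a separate initialization step, which is the "usable state" property described above.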

We also really need to get CI test coverage for HA and Kerberos, because this issue should not
have been around for so long.

