hadoop-common-issues mailing list archives

From "Sanjay Radia (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-9880) RPC Server should not unconditionally create SaslServer with Token auth.
Date Fri, 16 Aug 2013 01:06:48 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741771#comment-13741771 ]

Sanjay Radia commented on HADOOP-9880:

We saw exactly the same error during a test this morning.
The two JIRAs that caused this problem are the recent HADOOP-9421 and the earlier HDFS-3083.

HADOOP-9421 improved the SASL protocol.
ZKFC uses Kerberos, but the server side initiates the token-based challenge just in case the
client wants to use a token. As part of doing that, the server calls secretManager.checkAvailableForRead(),
which fails because the NN is in standby.
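
To make the failure mode concrete, here is a rough sketch. It does not use the real Hadoop classes: SaslNegotiationSketch, SecretManager, negotiateEagerly, negotiateLazily and authenticate are hypothetical stand-ins that only mimic the roles described above. The lazy variant is one possible direction, assuming the availability check can be deferred until the client actually selects TOKEN; it is not necessarily the fix that will be committed.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these types only mimic the roles described in this
// comment and are NOT the real Hadoop RPC or SASL classes.
class SaslNegotiationSketch {

    /** Stand-in for a secret manager whose availability follows the NN's HA state. */
    static class SecretManager {
        private final boolean active;

        SecretManager(boolean active) {
            this.active = active;
        }

        void checkAvailableForRead() {
            if (!active) {
                throw new IllegalStateException("secret manager unavailable: NN is in standby");
            }
        }
    }

    /** Eager variant: touches the secret manager while merely advertising TOKEN. */
    static List<String> negotiateEagerly(SecretManager sm) {
        List<String> offered = new ArrayList<>();
        sm.checkAvailableForRead(); // fails in standby, before the client has chosen anything
        offered.add("TOKEN");
        offered.add("KERBEROS");
        return offered;
    }

    /** Lazy variant (assumed alternative): advertises both mechanisms without any state check. */
    static List<String> negotiateLazily() {
        List<String> offered = new ArrayList<>();
        offered.add("TOKEN");
        offered.add("KERBEROS");
        return offered;
    }

    /** The availability check runs only once the client has actually selected TOKEN. */
    static String authenticate(String chosen, SecretManager sm) {
        if ("TOKEN".equals(chosen)) {
            sm.checkAvailableForRead();
        }
        return chosen + " ok";
    }

    public static void main(String[] args) {
        SecretManager standby = new SecretManager(false);
        try {
            negotiateEagerly(standby);
        } catch (IllegalStateException e) {
            System.out.println("eager: " + e.getMessage());
        }
        negotiateLazily();
        System.out.println("lazy: " + authenticate("KERBEROS", standby));
    }
}
```

In the eager variant a Kerberos-only client such as ZKFC is rejected while the NN is in standby, even though it never asked for a token. In the lazy variant the same client gets through, and only a client that picks TOKEN hits the standby check.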

It is really bizarre that there is a check for the server's state (active or standby) as part
of SASL. This was introduced in HDFS-3083 to deal with a failover bug. In HDFS-3083, Aaron
noted that he does not like the solution: "I'm not in love with this solution, as it leaks
abstractions all over the place,". The abstraction layer violation finally caught up with us.

It turns out that even prior to Daryn's HADOOP-9421, a similar problem could have occurred if the ZKFC
had used Kerberos for the first connection and tokens for any subsequent connections.

An immediate fix is required for what HADOOP-9421 broke, but I believe we also need to fix
the fix that HDFS-3083 introduced: the abstraction layer violations need to be cleaned up.
> RPC Server should not unconditionally create SaslServer with Token auth.
> ------------------------------------------------------------------------
>                 Key: HADOOP-9880
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9880
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.1.0-beta
>            Reporter: Kihwal Lee
>            Priority: Blocker
> buildSaslNegotiateResponse() will create a SaslRpcServer with TOKEN auth. When create()
> is called against it, secretManager.checkAvailableForRead() is called, which fails in HA standby.
> Thus HA standby nodes cannot be transitioned to active.

