hadoop-hdfs-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2856) Fix block protocol so that Datanodes don't require root or jsvc
Date Fri, 27 Jun 2014 19:20:26 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14046317#comment-14046317 ]

Chris Nauroth commented on HDFS-2856:

Jitendra, thank you for taking a look at this patch.

bq. ...it seems the encrypted key is obtained from namenode via rpc for every block...

Actually, we cache the encryption key so that we don't need to keep repeating that RPC.  (This
is true on current trunk and with my patch too.)  Key retrieval is now wrapped behind the
{{DataEncryptionKeyFactory}} interface.  There are two implementers of this: the {{DFSClient}}
itself and the {{NameNodeConnector}} used by the balancer.  In both of those classes, if you
look at the {{newDataEncryptionKey}} method, you'll see that they lazily fetch a key and cache
it until the key expires.
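The caching pattern looks roughly like this (a simplified sketch; the class and field names here are illustrative stand-ins, not the actual HDFS code, and only {{newDataEncryptionKey}} is a real method name from the patch):

```java
// Sketch of the lazy fetch-and-cache pattern described above.
// Names other than newDataEncryptionKey are hypothetical.
class CachingKeyFactory {

    /** Stand-in for the key object returned by the NameNode RPC. */
    static class DataEncryptionKey {
        final byte[] keyBytes;
        final long expiryDate; // absolute expiry time in millis

        DataEncryptionKey(byte[] keyBytes, long expiryDate) {
            this.keyBytes = keyBytes;
            this.expiryDate = expiryDate;
        }
    }

    private DataEncryptionKey cachedKey;
    private int rpcCalls; // counts simulated NameNode round trips

    /** Stand-in for the actual RPC to the NameNode. */
    private DataEncryptionKey fetchKeyFromNameNode() {
        rpcCalls++;
        return new DataEncryptionKey(new byte[16],
            System.currentTimeMillis() + 60_000L); // expires in one minute
    }

    /** Lazily fetches a key and reuses it until it expires. */
    synchronized DataEncryptionKey newDataEncryptionKey() {
        if (cachedKey == null
            || cachedKey.expiryDate <= System.currentTimeMillis()) {
            cachedKey = fetchKeyFromNameNode();
        }
        return cachedKey;
    }

    int getRpcCalls() { return rpcCalls; }
}
```

Repeated calls inside the expiry window reuse the cached key, so only the first call pays for the RPC.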

bq. getEncryptedStreams doesn't use access token. IMO the user and the password should be
derived from the accesstoken rather than the key.

Thanks for catching that.  This is a private method, so I can easily remove the access token
from the signature.  We can't change the user/password calculation for the encrypted case now
without breaking compatibility.

bq. It might make sense to define the defaults for the new configuration variables in hdfs-default
and/or as constants. It helps in code reading at times.

The patch documents the new properties dfs.data.transfer.protection and dfs.data.transfer.saslproperties.resolver.class
in hdfs-default.xml.  The default values are set to empty/undefined.  I think this is what
we want, because it's an opt-in feature.  Let me know if you had any other configuration properties
in mind.
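For reference, opting in would look something like this in hdfs-site.xml (the property name is from the patch; the value shown is one possible SASL protection level, and the property is unset by default):

```xml
<!-- Opt-in: unset by default, so existing deployments are unaffected. -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>
```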

bq. Log.debug should be wrapped inside if (Log.isDebugEnabled()) condition.

The new classes use slf4j.  (There was some discussion on mailing lists a few months ago about
starting to use this library in new classes.)  With slf4j, it's no longer necessary to check
{{isDebugEnabled}}.  slf4j accepts string substitution variables using varargs, and it checks
the log level internally first before doing any string concatenation.  Explicitly checking
{{isDebugEnabled}} wouldn't provide any performance benefit.
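To make the mechanism concrete, here is a minimal illustration (not the real slf4j implementation) of why the explicit guard is redundant: the logger checks its level itself before doing any formatting, so a disabled {{debug}} call never pays the string-concatenation cost.

```java
// Minimal stand-in for an slf4j-style logger, to show why an explicit
// isDebugEnabled() guard adds nothing: the level check happens inside
// debug() before any substitution of the {} placeholders.
class MiniLogger {
    private final boolean debugEnabled;
    private int formatCount; // counts how many times formatting actually ran

    MiniLogger(boolean debugEnabled) {
        this.debugEnabled = debugEnabled;
    }

    /** slf4j-style call: formatting is deferred until the level check passes. */
    void debug(String template, Object... args) {
        if (!debugEnabled) {
            return; // cheap early exit; args are never formatted
        }
        formatCount++;
        String msg = template;
        for (Object arg : args) {
            msg = msg.replaceFirst("\\{\\}", String.valueOf(arg));
        }
        System.out.println(msg);
    }

    int getFormatCount() { return formatCount; }
}
```

So a call such as {{LOG.debug("fetched key {} for block {}", key, block)}} is already guarded internally, which is why the patch's new classes skip the {{isDebugEnabled}} check.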

bq. checkTrustAndSend obtains new encryption key, irrespective of the qop needed. I believe
the encryption key is needed only for specialized encryption case.

The two implementations of {{DataEncryptionKeyFactory}} mentioned above only retrieve an encryption
key if encryption is enabled (i.e., the NameNode is configured with dfs.encrypt.data.transfer=true).
For a deployment configured with SASL on DataTransferProtocol, this will be false, so it
won't actually get a key.  I'll put a comment in {{SaslDataTransferClient}} to clarify this.

bq. SaslDataTransferClient object in NameNodeConnector.java seems out of place, the NameNodeConnector
is supposed to encapsulate only namenode connections. Can we avoid the saslClient in this

Yeah, what was I thinking there?  :-)  This is needed by the balancer for its DataNode communication
when it needs to move blocks.  Let me see if I can move it right into the {{Balancer}} class.

bq. RemotePeerFactory.java: Javadoc needs update.

Will do.  Thanks for the catch.

bq. Minor nit: checkTrustAndSend returns null for skipping handshake which has to be checked
in the caller. It could just return the same stream pair, which otherwise every caller has
to do.

I actually need to use null as a sentinel value.  In {{peerSend}}, I need to know whether
or not a SASL handshake was performed, and if so, wrap the peer in an instance of {{EncryptedPeer}}
(which would be better named {{SaslPeer}} at this point, but we can refactor that later).
If I always returned a non-null {{IOStreamPair}}, then I wouldn't be able to do this check.
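In other words, the caller branches on the sentinel roughly like this (a hypothetical sketch; the class is a stand-in and only {{checkTrustAndSend}}, {{peerSend}}, {{IOStreamPair}}, and {{EncryptedPeer}} are names from the patch):

```java
// Hypothetical sketch of the null-as-sentinel pattern described above;
// the real methods take peers and streams rather than these simplified types.
class SaslSentinelDemo {

    /** Simplified stand-in for the real input/output stream pair. */
    static class IOStreamPair {
        final Object in, out;
        IOStreamPair(Object in, Object out) { this.in = in; this.out = out; }
    }

    /** Returns null when the handshake is skipped, a new pair when it ran. */
    static IOStreamPair checkTrustAndSend(boolean handshakeNeeded,
                                          IOStreamPair plain) {
        if (!handshakeNeeded) {
            return null; // sentinel: caller knows no SASL wrapping happened
        }
        // Stands in for the SASL-wrapped streams produced by the handshake.
        return new IOStreamPair(plain.in, plain.out);
    }

    /** Caller uses the sentinel to decide whether to wrap the peer. */
    static String peerSend(boolean handshakeNeeded, IOStreamPair plain) {
        IOStreamPair sasl = checkTrustAndSend(handshakeNeeded, plain);
        if (sasl != null) {
            return "EncryptedPeer"; // handshake ran: wrap the peer
        }
        return "plain peer"; // no handshake: use the peer as-is
    }
}
```

If {{checkTrustAndSend}} always returned a non-null pair, both branches would look identical to the caller, and {{peerSend}} couldn't tell whether wrapping is needed.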

I'll get to work on a new revision that incorporates your feedback.  Thanks again!

> Fix block protocol so that Datanodes don't require root or jsvc
> ---------------------------------------------------------------
>                 Key: HDFS-2856
>                 URL: https://issues.apache.org/jira/browse/HDFS-2856
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, security
>    Affects Versions: 3.0.0, 2.4.0
>            Reporter: Owen O'Malley
>            Assignee: Chris Nauroth
>         Attachments: Datanode-Security-Design.pdf, Datanode-Security-Design.pdf, Datanode-Security-Design.pdf,
HDFS-2856-Test-Plan-1.pdf, HDFS-2856.1.patch, HDFS-2856.2.patch, HDFS-2856.3.patch, HDFS-2856.4.patch,
HDFS-2856.5.patch, HDFS-2856.prototype.patch
> Since we send the block tokens unencrypted to the datanode, we currently start the datanode
as root using jsvc to get a secure (< 1024) port.
> If we have the datanode generate a nonce and send it on the connection, and the client sends
an HMAC of the nonce back instead of the block token, it won't reveal any secrets. Thus, we
wouldn't require a secure port and would not require root or jsvc.

This message was sent by Atlassian JIRA
