hadoop-common-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.
Date Thu, 09 Mar 2017 22:06:38 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15903942#comment-15903942 ]

Andrew Wang commented on HADOOP-14104:

Here's my review on v2, along with responses to some of Yongjun's questions. Thanks for working
on this Rushabh!

bq. in createWrappedOutputStream(), where conf is the configuration of the local cluster. There
is a possibility that the local configuration is different from the remote cluster's. So it's
possible to fail here.

The conf is only used to configure the CryptoCodec, which does not need a KMS URI. I think
this is okay; all the CryptoCodec parameters are included in the {{feInfo}}.
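To illustrate the point (a hedged, self-contained sketch, not the actual HDFS implementation; the {{FeInfo}} class and field names below are illustrative stand-ins for {{FileEncryptionInfo}}): the cipher suite, key, and IV all travel with the file's encryption info, so the wrapped stream can be built from {{feInfo}} alone, and the local conf never needs to know about a KMS URI.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.util.Arrays;

public class FeInfoSketch {
    // Stand-in for FileEncryptionInfo: everything the codec needs rides with the file.
    static class FeInfo {
        final String cipherSuite; final byte[] key; final byte[] iv;
        FeInfo(String cs, byte[] key, byte[] iv) {
            this.cipherSuite = cs; this.key = key; this.iv = iv;
        }
    }

    // The local conf never supplies key material; only feInfo does.
    static byte[] transform(int mode, FeInfo fe, byte[] data) throws Exception {
        Cipher c = Cipher.getInstance(fe.cipherSuite);
        c.init(mode, new SecretKeySpec(fe.key, "AES"), new IvParameterSpec(fe.iv));
        return c.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        FeInfo fe = new FeInfo("AES/CTR/NoPadding", new byte[16], new byte[16]);
        byte[] plain = "hello".getBytes("UTF-8");
        byte[] enc = transform(Cipher.ENCRYPT_MODE, fe, plain);
        byte[] dec = transform(Cipher.DECRYPT_MODE, fe, enc);
        System.out.println(Arrays.equals(plain, dec)); // round-trips with no KMS URI involved
    }
}
```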

bq. @awang would you please confirm if it's ok to do so since this class is public), and use
this constant at the multiple places that currently use ""

Yeah, it's fine. I'd like to improve this in this patch if possible, since it removes redundancy.

bq. 3. Notice that "dfs.encryption.key.provider.uri" is deprecated and replaced with hadoop.security.key.provider.path
(see HDFS-10489). So suggest to replace variable name keyProviderUri with keyProviderPath

I think we can ignore this for now, like Rushabh said; we can handle it in another JIRA if necessary.

bq. Seems we need a similar change in WebHdfsFileSystem when calling addDelegationTokens

The DN does the encryption/decryption in WebHDFS, so the client doesn't need to do any KMS
operations.

It does bring up a question regarding the DN DFSClient though. It looks like WebHdfsHandler
creates a new DFSClient each time, which means we won't benefit from getServerDefaults caching.

Is the fix to make config preferred over getServerDefaults?
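To make the caching concern concrete, here is a hedged sketch of time-bounded caching of a server-fetched value (class and names are hypothetical, not the actual DFSClient code): a per-request DFSClient, as in WebHdfsHandler, defeats this kind of cache because the cached value dies with the client instance.

```java
import java.util.function.Supplier;

// Hypothetical sketch: cache a value fetched via RPC until it expires.
public class ServerDefaultsCache<T> {
    private final Supplier<T> fetch;      // stands in for the getServerDefaults() RPC
    private final long validityMillis;
    private T cached;
    private long fetchedAt = Long.MIN_VALUE;

    public ServerDefaultsCache(Supplier<T> fetch, long validityMillis) {
        this.fetch = fetch; this.validityMillis = validityMillis;
    }

    public synchronized T get() {
        long now = System.currentTimeMillis();
        if (cached == null || now - fetchedAt > validityMillis) {
            cached = fetch.get();          // one RPC, then reuse until expiry
            fetchedAt = now;
        }
        return cached;
    }

    public static void main(String[] args) {
        int[] rpcCalls = {0};
        ServerDefaultsCache<String> cache =
            new ServerDefaultsCache<>(() -> { rpcCalls[0]++; return "kp-uri"; }, 60_000);
        cache.get(); cache.get(); cache.get();
        System.out.println(rpcCalls[0]); // a long-lived client pays the RPC only once
    }
}
```

A client created fresh per request would instead pay one RPC per request, which is the behavior in question.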

bq. <StandbyException>

Does this exception actually make it to clients? The HA RPC proxy normally catches the StandbyException
and fails over to the other NN. We can write a unit test for this to verify if we're unsure.

My additional review comments:

* typo nameonde
* ServerDefaults#getKeyProviderUri needs javadoc explaining how to interpret null and empty
(IIUC, null means not set; empty means set but not enabled)
* In docs, "DFSClients" is an internal name. Please rename to say "HDFS clients" or similar.
Same for "dfs clients" in core-default.xml
* There are a lot of whitespace errors, please take a look at what's flagged by checkstyle.
Recommend using IDE autoformatting in the future.
* An actual mini-cluster test that mimics an MR job submitter and task's call pattern would
also be good.

* Would like to see the test reversed so it covers the fallback behavior, i.e.
    * set client config with kp1, check that it returns kp1
    * mock getServerDefaults() with kp2, check it returns kp2
    * set Credentials with kp3, check it returns kp3
* typo originalNameodeUri
* {{String lookUpKey = DFSClient.DFS_KMS_PREFIX + originalNameodeUri.toString();}} should
this be a {{getKey}} helper method in DFSClient rather than having the test code also construct
the key?
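The reversed test above amounts to pinning down a precedence chain. A hedged, self-contained sketch of that lookup order (the {{DFS_KMS_PREFIX}} name mirrors the patch; everything else, including the maps standing in for Credentials and config, is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative model of the lookup order the reversed test would verify:
// Credentials win over getServerDefaults(), which wins over client config.
public class KeyProviderResolution {
    static final String DFS_KMS_PREFIX = "dfs-kms-";  // mirrors DFSClient.DFS_KMS_PREFIX

    // The getKey helper suggested above, so tests need not rebuild the key by hand.
    static String getKey(String namenodeUri) {
        return DFS_KMS_PREFIX + namenodeUri;
    }

    static String resolve(Map<String, String> credentials, String nnUri,
                          String serverDefaultsUri, String clientConfUri) {
        String fromCreds = credentials.get(getKey(nnUri));
        if (fromCreds != null) return fromCreds;              // kp3 beats everything
        if (serverDefaultsUri != null) return serverDefaultsUri; // kp2 beats config
        return clientConfUri;                                  // kp1 is the fallback
    }

    public static void main(String[] args) {
        Map<String, String> creds = new HashMap<>();
        String nn = "hdfs://nn1";
        System.out.println(resolve(creds, nn, null, "kp1"));  // only client config set
        System.out.println(resolve(creds, nn, "kp2", "kp1")); // server defaults win
        creds.put(getKey(nn), "kp3");
        System.out.println(resolve(creds, nn, "kp2", "kp1")); // credentials win
    }
}
```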

> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>                 Key: HADOOP-14104
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14104
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: kms
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch, HADOOP-14104-trunk-v2.patch,
> According to the current implementation of the kms provider in the client conf, there can only be
> one kms.
> In a multi-cluster environment, if a client is reading encrypted data from multiple clusters,
> it will only get the kms token for the local cluster.
> Not sure whether the target version is correct or not.

This message was sent by Atlassian JIRA

