hadoop-common-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14104) Client should always ask namenode for kms provider path.
Date Thu, 02 Mar 2017 07:34:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891749#comment-15891749 ]

Andrew Wang commented on HADOOP-14104:
--------------------------------------

Thanks for reviewing, Wei-chiu and Yongjun. Let me reply to a few comments of Rushabh:

bq. This is not a precise statement though, considering that multiple kms instances can be
added for load balancing purposes. Would you mind to update the release note once this patch
gets committed?

By this, I think you mean we should update the JIRA summary and related text to say "for the
KeyProvider URI"?

bq. Can we also add a test to ensure clients can access files in an encrypted remote cluster
using the token obtained from the remote NameNode?

What constitutes "remote" here? Clients fetch delegation tokens based on the specified filesystem(s),
so I don't think it matters where the client or cluster are located.

bq. 5. Currently getServerDefaults() contact NN every hour, to find if there is any update
of keyprovider. If keyprovider changed within the hour, client code may get into exception,
wonder if we have mechanism to handle the exception and update the keyprovider and try again?

What kind of error handling do you recommend here? I do think at least a log message
would be nice.
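
To make the question concrete, here is a minimal sketch of one possible handling: on failure, re-ask the NN, and if the key provider changed, log it and retry once. Every name here (KeyProviderRetrySketch, KmsCall, callWithRefresh) is hypothetical and not taken from the patch; the Supplier only stands in for the getServerDefaults() RPC.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;
import java.util.logging.Logger;

/** Hypothetical sketch; none of these names come from the patch. */
class KeyProviderRetrySketch {
  private static final Logger LOG = Logger.getLogger("KeyProviderRetrySketch");

  /** Stand-in for any operation that talks to the KMS at the given URI. */
  interface KmsCall<T> {
    T run(String keyProviderUri) throws IOException;
  }

  // Stand-in for the getServerDefaults() RPC to the NameNode.
  private final Supplier<String> fetchFromNameNode;
  private final AtomicReference<String> cachedUri = new AtomicReference<>();

  KeyProviderRetrySketch(Supplier<String> fetchFromNameNode) {
    this.fetchFromNameNode = fetchFromNameNode;
    this.cachedUri.set(fetchFromNameNode.get());
  }

  /** Runs the call; on failure, refreshes the URI and retries once if it changed. */
  <T> T callWithRefresh(KmsCall<T> call) throws IOException {
    String uri = cachedUri.get();
    try {
      return call.run(uri);
    } catch (IOException e) {
      String fresh = fetchFromNameNode.get();
      if (fresh == null || fresh.equals(uri)) {
        throw e; // provider did not change, so the failure has another cause
      }
      LOG.warning("Key provider changed from " + uri + " to " + fresh
          + "; retrying once");
      cachedUri.set(fresh);
      return call.run(fresh);
    }
  }
}
```

This costs one extra NN RPC on every failed KMS call, which is part of why I'm asking what handling is actually wanted.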

Some new questions of my own:

* I like that getServerDefaults is lock-free, but I'm still worried about the overhead. MR
tasks are short-lived and thus don't benefit from the caching. This also affects all clients,
on both encrypted and unencrypted clusters. I think getServerDefaults is also currently only
called when SASL is enabled. Have you done any performance testing of this RPC?
* Another option is to put this behind a config key which defaults to off, and file a new JIRA
to track defaulting it on in a later release.
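
A rough sketch of the second option, under stated assumptions: the flag, the class, and all field names below are made up for illustration, and the Supplier stands in for the getServerDefaults() RPC. With the flag off, the client conf is used and no RPC is issued. With it on, a lock-free cache refreshes after a TTL (one hour would match the current interval).

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical sketch; the flag and all names are made up for illustration. */
class GatedKeyProviderLookup {
  /** Immutable snapshot, so readers never need a lock. */
  private static final class Entry {
    final String uri;
    final long fetchedAtMs;

    Entry(String uri, long fetchedAtMs) {
      this.uri = uri;
      this.fetchedAtMs = fetchedAtMs;
    }
  }

  private final boolean askNameNode;   // the proposed config key; false means off
  private final String localConfUri;   // key provider from the client conf
  private final Supplier<String> rpc;  // stand-in for the getServerDefaults() RPC
  private final LongSupplier clockMs;  // injectable clock, in milliseconds
  private final long ttlMs;            // how long a fetched value stays fresh
  private final AtomicReference<Entry> cache = new AtomicReference<>();

  GatedKeyProviderLookup(boolean askNameNode, String localConfUri,
      Supplier<String> rpc, LongSupplier clockMs, long ttlMs) {
    this.askNameNode = askNameNode;
    this.localConfUri = localConfUri;
    this.rpc = rpc;
    this.clockMs = clockMs;
    this.ttlMs = ttlMs;
  }

  String getKeyProviderUri() {
    if (!askNameNode) {
      return localConfUri; // flag off: no extra RPC for any client
    }
    long now = clockMs.getAsLong();
    Entry e = cache.get();
    if (e == null || now - e.fetchedAtMs >= ttlMs) {
      // Racing threads may each issue an RPC; the last write wins.
      e = new Entry(rpc.get(), now);
      cache.set(e);
    }
    return e.uri;
  }
}
```

Note that a fresh process with the flag on always pays at least one RPC, which is exactly the short-lived MR task concern from the first bullet.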

> Client should always ask namenode for kms provider path.
> --------------------------------------------------------
>
>                 Key: HADOOP-14104
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14104
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: kms
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>         Attachments: HADOOP-14104-trunk.patch, HADOOP-14104-trunk-v1.patch
>
>
> According to current implementation of kms provider in client conf, there can only be one kms.
> In multi-cluster environment, if a client is reading encrypted data from multiple clusters it will only get kms token for local cluster.
> Not sure whether the target version is correct or not.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
