hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10433) Key Management Server based on KeyProvider API
Date Tue, 29 Apr 2014 07:50:22 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13984101#comment-13984101
] 

Andrew Wang commented on HADOOP-10433:
--------------------------------------

Thanks for the rev here Tucu, I can tell this wasn't a small effort. Overall it looks real
nice, the new REST API is fine by me. I have some more review comments though, but expect
to +1 after this round (pending agreement by reviewers on the REST API details).

Running the KMS:
* Overall a more pleasant experience, tarball builds, audit logging has newlines, I didn't
see stacktraces when running commands.
* I know I asked for less spew in kms.sh, but defaulting KMS_SILENT to true is perhaps too
little. Without it, running just "./sbin/kms.sh" shows nothing, not even help text, and if
start/stop hit an error, we again see nothing (have to know where the logs are and hope it
has the error there). This isn't a biggie, but maybe we can improve it a little more.

JSON:
* I see KMSMetadata and KMSKeyVersion classes, but they don't look like they do much. I'm
guessing this is from an aborted attempt to do POJO<->JSON? That'd would be awesome,
but I won't get too excited.
* If you stick with the JSON helper methods, could we pull them all out into a separate helper
class and make them all static, along with the field constants? Feels weird to have the Server
digging into ClientProvider for field names, when theoretically a different client implementation
(REST!) could be interfacing with the KMS.

KMSServer and KMSClientProvider:
* Noticed that KMSServer#getKeyMetadata includes the name, but it's unused on the client side,
nor part of KeyProvider#Metadata.
* I think we still miss audit logging if an exception is thrown. e.g. in #rolloverKey, getPrincipal,
checkNotEmpty, provider.rollNewVersion can all throw something. This is why a try/finally
might be easiest.
* Can make many methods static: validateResponse, call, all the JSON functions, getPrincipal,
assertAccess, removeKeyMaterial, getKeyURI
* On the NoSuchAlgorithmException, #validateResponse casts everything to an IOException, so
I don't think anything else comes out of this function. Not sure this cast is kosher. I think
the catch for Exception there also needs to wrap the created String into an exception and
throw it.
* Given the discussion about UUIDs as versions, should we be exposing KMSClientProvider#buildVersionName
publicly? This seems like an implementation detail of the backing KeyProvider.

Misc things:
* I think a rogue regex changed GET to GET_OP across the entire hadoop repo, which is why
we're picking up all these test changes, and quite possibly why we have some weird test failures.
* KMSACLs, the new comment about in #run probably should go in #loadACLs, or be removed.
* KeyNotFoundException serialVersionUID, this is something Eclipse is warning about, so it'd
be nice to just add it.
* In the cache provider, #getCurrentKey and getKeyVersion, could set the cause once in a temp
var to avoid having to call getCause() repeatedly.
* KMSServer semantically should still be renamed to KMServer or just KMS

> Key Management Server based on KeyProvider API
> ----------------------------------------------
>
>                 Key: HADOOP-10433
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10433
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: security
>    Affects Versions: 3.0.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch,
HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, HADOOP-10433.patch, HadoopKMSDocsv2.pdf,
KMS-doc.pdf
>
>
> (from HDFS-6134 proposal)
> Hadoop KMS is the gateway, for Hadoop and Hadoop clients, to the underlying KMS. It provides
an interface that works with existing Hadoop security components (authenticatication, confidentiality).
> Hadoop KMS will be implemented leveraging the work being done in HADOOP-10141 and HADOOP-10177.
> Hadoop KMS will provide an additional implementation of the Hadoop KeyProvider class.
This implementation will be a client-server implementation.
> The client-server protocol will be secure:
> * Kerberos HTTP SPNEGO (authentication)
> * HTTPS for transport (confidentiality and integrity)
> * Hadoop ACLs (authorization)
> The Hadoop KMS implementation will not provide additional ACL to access encrypted files.
For sophisticated access control requirements, HDFS ACLs (HDFS-4685) should be used.
> Basic key administration will be supported by the Hadoop KMS via the, already available,
Hadoop KeyShell command line tool
> There are minor changes that must be done in Hadoop KeyProvider functionality:
> The KeyProvider contract, and the existing implementations, must be thread-safe
> KeyProvider API should have an API to generate the key material internally
> JavaKeyStoreProvider should use, if present, a password provided via configuration
> KeyProvider Option and Metadata should include a label (for easier cross-referencing)
> To avoid overloading the underlying KeyProvider implementation, the Hadoop KMS will cache
keys using a TTL policy.
> Scalability and High Availability of the Hadoop KMS can achieved by running multiple
instances behind a VIP/Load-Balancer. For High Availability, the underlying KeyProvider implementation
used by the Hadoop KMS must be High Available.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message