hadoop-common-issues mailing list archives

From "Kai Zheng (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance
Date Fri, 22 Apr 2016 20:32:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254603#comment-15254603 ]

Kai Zheng commented on HADOOP-10768:

Thanks for the design doc and the clarification. It looks like good work, [~dian.fu]!

Comments about the doc:
* It would be good to state clearly that this builds application-layer data encryption *ABOVE* SASL
(not mixed with SASL and not in the same layer). Accordingly, you can greatly simplify the flow picture
by reducing it to only two steps: 1) SASL handshake; 2) Hadoop data encryption
cipher negotiation. The 7 illustrated SASL steps may be specific to GSSAPI; for other mechanisms
the handshake may be much simpler. In any case, we don't need to show it here.
* Why do we need {{SaslCryptoCodec}}? What does it do? Maybe once the separate encryption negotiation
is complete, we can create CryptoOutputStream directly?
* Since we're taking the same approach as data transfer encryption (both do a separate
encryption cipher negotiation and data encryption after and above SASL, one for file
data, the other for RPC data), maybe we can mostly reuse the existing work? Did the implementation
go this way, or is there any difference?
* How are the encryption key(s) negotiated or determined? Does it consider the established session
key from SASL, if available? It seems to produce a key pair; how are the two keys used?
* Do we hard-code the AES cipher to AES/CTR mode? I guess other modes such as AES/GCM could also
be used.
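
To make the two-step flow and the cipher-mode question concrete, here is a minimal sketch using only the JDK's {{javax.crypto}} API. It is not taken from the patch or the design doc: the class name, the {{SERVER_SUITES}} list, the "first server-preferred match" negotiation rule and the random placeholder key are all assumptions for illustration. It assumes the SASL handshake has already completed, and it leaves open how the real key is established, which is exactly the question above.

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.security.spec.AlgorithmParameterSpec;
import java.util.Arrays;
import java.util.List;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class CipherNegotiationSketch {
    // Hypothetical: suites the server accepts, in its own preference order.
    static final List<String> SERVER_SUITES =
        Arrays.asList("AES/GCM/NoPadding", "AES/CTR/NoPadding");

    /** Step 2 of the flow: pick the first server-preferred suite the client also offered. */
    static String negotiate(List<String> clientOffer) {
        for (String suite : SERVER_SUITES) {
            if (clientOffer.contains(suite)) {
                return suite;
            }
        }
        return null; // no common suite
    }

    /** IV length in bytes: 12 for GCM, 16 (one AES block) for CTR. */
    static int ivLength(String suite) {
        return suite.contains("/GCM/") ? 12 : 16;
    }

    /** Builds the parameter spec matching the negotiated mode. */
    static AlgorithmParameterSpec params(String suite, byte[] iv) {
        return suite.contains("/GCM/")
            ? new GCMParameterSpec(128, iv)  // 128-bit authentication tag
            : new IvParameterSpec(iv);
    }

    /** Encrypts or decrypts with whatever suite was negotiated; nothing is hard-coded to CTR. */
    static byte[] apply(int mode, String suite, byte[] key, byte[] iv, byte[] data)
            throws Exception {
        Cipher cipher = Cipher.getInstance(suite);
        cipher.init(mode, new SecretKeySpec(key, "AES"), params(suite, iv));
        return cipher.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical client offer; step 1 (SASL handshake) is assumed to be done already.
        String suite = negotiate(Arrays.asList("AES/CTR/NoPadding", "AES/GCM/NoPadding"));

        // Placeholder key and IV from SecureRandom, standing in for the real key agreement.
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16];
        byte[] iv = new byte[ivLength(suite)];
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] message = "small RPC payload".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = apply(Cipher.ENCRYPT_MODE, suite, key, iv, message);
        byte[] decrypted = apply(Cipher.DECRYPT_MODE, suite, key, iv, encrypted);

        System.out.println("negotiated suite: " + suite);
        System.out.println("round trip ok: " + Arrays.equals(message, decrypted));
    }
}
```

In this sketch the mode is just a negotiated string, so supporting AES/GCM next to AES/CTR only changes the parameter spec and the IV length. A real implementation would also need a unique IV or counter per message under the same key, which the sketch does not address.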

> Optimize Hadoop RPC encryption performance
> ------------------------------------------
>                 Key: HADOOP-10768
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10768
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance, security
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Dian Fu
>         Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch, Optimize Hadoop
RPC encryption performance.pdf
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to "privacy". It
utilizes the SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for secure authentication and data
protection. Although {{GSSAPI}} supports AES, it does not use AES-NI by default, so
the encryption is slow and will become a bottleneck.
> After discussing with [~atm], [~tucu00] and [~umamaheswararao], we can do the same optimization
as in HDFS-6606: use AES-NI for a more than *20x* speedup.
> On the other hand, each RPC message is small, but RPC is frequent and there may be lots of
RPC calls on one connection, so we need to set up a benchmark to see the real improvement and then
make a trade-off.

This message was sent by Atlassian JIRA
