hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10768) Optimize Hadoop RPC encryption performance
Date Tue, 15 May 2018 15:34:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16476018#comment-16476018

Wei-Chiu Chuang commented on HADOOP-10768:

Here's my review comments. Some of them are already mentioned previously. [~dapengsun] would
you please update the patch if you have time available?

{quote}do you know why there's a ~50% degradation? That's concerning and may severely impede
performance (to the point I can't use it :() 
While that's the case, it is still 4~5x  performant than existing implementation, and I guess
I'm still happy about that. The amount of memory allocation is concerning. I'll try to find
time to profile it today.

 * CryptoInputStream
 the new readFully() method is a public method but lacks javadoc.
 is it different from the existing readFully() methods? Can you reuse the existing readFully()
 * Client
 Why are we catching exceptions in Client.shouldAuthenticateOverKrb()? That seems unnecessary.
If not catching exception causes a problem, please have a test case for it.
 * SaslUtil
 ** negotiateCipherOption()
 can you please throw a non IOException? I’m not in favor of making every method throwing
a generic IOExceptions. Similarly, update the method signature for the code path (getCipherOption,
 This method is very similar to DataTransferSaslUtil#negotiateCipherOption() except for the
configuration keys.
 I see no reason to duplicate the code, especially it involves some coding/decoding, which
is not that easy to comprehend.
 ** Also, every time this method is called, it returns a new List<>. I feel like this
is too much of a cost. Can we reduce the memory footprint?

 * SaslRpcClient
 ** saslConnect()
 "Get SASL RPC CipherOption from Conf" + cipherOptions);
 —> missing a space after Conf
 the method is well over 100 lines in length now. Some code refactor will greatly improve
 ** handleSaslCipherOptions()
 throw new SaslException(e.getMessage(), e);
 the exception message should be more descriptive. It could be something like “Unable to
initialize SaslCryptoCodec”

> Optimize Hadoop RPC encryption performance
> ------------------------------------------
>                 Key: HADOOP-10768
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10768
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: performance, security
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Yi Liu
>            Assignee: Dapeng Sun
>            Priority: Major
>         Attachments: HADOOP-10768.001.patch, HADOOP-10768.002.patch, HADOOP-10768.003.patch,
HADOOP-10768.004.patch, HADOOP-10768.005.patch, HADOOP-10768.006.patch, HADOOP-10768.007.patch,
HADOOP-10768.008.patch, HADOOP-10768.009.patch, Optimize Hadoop RPC encryption performance.pdf
> Hadoop RPC encryption is enabled by setting {{hadoop.rpc.protection}} to "privacy". It
utilized SASL {{GSSAPI}} and {{DIGEST-MD5}} mechanisms for secure authentication and data
protection. Even {{GSSAPI}} supports using AES, but without AES-NI support by default, so
the encryption is slow and will become bottleneck.
> After discuss with [~atm], [~tucu00] and [~umamaheswararao], we can do the same optimization
as in HDFS-6606. Use AES-NI with more than *20x* speedup.
> On the other hand, RPC message is small, but RPC is frequent and there may be lots of
RPC calls in one connection, we needs to setup benchmark to see real improvement and then
make a trade-off. 

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message