hadoop-hdfs-dev mailing list archives

From Erik Krogen <ekro...@linkedin.com>
Subject Re: [DISCUSS] Hadoop RPC encryption performance improvements
Date Thu, 01 Nov 2018 21:29:03 GMT
Hey Wei-Chiu,


We (LinkedIn) are definitely interested in the progression of this feature. Having surveyed
HADOOP-10768 vs. HADOOP-13836, we feel that HADOOP-10768 is more in line with Hadoop's
progression: it reuses the existing SASL layer, stays consistent with the encryption used
for data transfer, and avoids the need to set up client key/trust stores. Given that this
is such a security-critical piece of code, I think we should make sure to get some additional
sets of eyes on the patch and ensure that all of Daryn's concerns are fully addressed, but
the approach seems valid.
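For anyone less familiar with the current setup, the SASL-based encryption in question is the one controlled by the standard quality-of-protection properties. A rough sketch, not a complete secure configuration (property names are the stock Hadoop/HDFS ones; values shown are just one possibility):

```xml
<!-- core-site.xml: SASL quality-of-protection for Hadoop RPC.
     "privacy" enables encryption, i.e. the slow path under discussion. -->
<property>
  <name>hadoop.rpc.protection</name>
  <value>privacy</value> <!-- authentication | integrity | privacy -->
</property>

<!-- hdfs-site.xml: the analogous setting for the data transfer path,
     which HADOOP-10768 stays consistent with. -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>privacy</value>
</property>
```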


Though we are interested in the Netty SSL approach, it is very difficult to make any judgments
on it at this time with so little information available. How fundamental a code change will
this be? Is it fully backwards compatible? Will switching to a new RPC engine introduce a
whole new range of potential performance issues and/or bugs? We appreciate the point that
outsourcing such security-critical concerns to another widely used, battle-tested framework
could be a big benefit, but we are worried about the associated risks. More detailed
information would help to assuage these concerns.


One additional point: right now, the different approaches are being measured with different
benchmarks. For example, HADOOP-13836 posted results from Terasort, while HADOOP-10768 posted
results from RPCCallBenchmark. Performance is clearly crucial to this decision, so we should
ensure that any comparisons are apples-to-apples, with the same benchmark and the same test
setup.
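To make that concrete, an apples-to-apples comparison would mean running the same workloads against each candidate branch on identical hardware. A sketch only; the class and jar names are the stock ones, but exact invocations and paths vary by version and distribution:

```shell
# Microbenchmark cited on HADOOP-10768 (ships in the hadoop-common test
# artifacts); run it unmodified against each patched build:
hadoop org.apache.hadoop.ipc.RPCCallBenchmark

# End-to-end workload cited on HADOOP-13836; teragen's argument is the
# number of 100-byte rows, so 10^9 rows is roughly 100 GB of input:
hadoop jar hadoop-mapreduce-examples.jar teragen 1000000000 /bench/terasort-in
hadoop jar hadoop-mapreduce-examples.jar terasort /bench/terasort-in /bench/terasort-out
```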

Thanks,
Erik Krogen
LinkedIn

________________________________
From: Wei-Chiu Chuang <weichiu@apache.org>
Sent: Wednesday, October 31, 2018 6:43 AM
To: Hadoop Common; Hdfs-dev
Subject: Re: [DISCUSS] Hadoop RPC encryption performance improvements

Ping. Anyone? Cloudera is interested in moving forward with the RPC
encryption improvements, but I'd just like to get consensus on which approach
to go with.

Otherwise I'll pick HADOOP-10768, since it's ready for commit and I've
spent time testing it.

On Thu, Oct 25, 2018 at 11:04 AM Wei-Chiu Chuang <weichiu@apache.org> wrote:

> Folks,
>
> I would like to invite all to discuss the various Hadoop RPC encryption
> performance improvements. As you probably know, Hadoop RPC encryption
> currently relies on Java SASL and has _really_ bad performance (in terms
> of RPCs per second, roughly 15-20% of the throughput without SASL).
>
> There have been some attempts to address this, most notably, HADOOP-10768
> <https://issues.apache.org/jira/browse/HADOOP-10768> (Optimize Hadoop RPC
> encryption performance) and HADOOP-13836
> <https://issues.apache.org/jira/browse/HADOOP-13836> (Securing Hadoop RPC
> using SSL). But it looks like neither attempt has been progressing.
>
> During the recent Hadoop contributor meetup, Daryn Sharp mentioned he's
> working on another approach that leverages Netty for its SSL encryption
> and integrates Netty with Hadoop RPC, so that Hadoop RPC automatically
> benefits from Netty's SSL encryption performance.
>
> So there are at least 3 attempts to address this issue as I see it. Do we
> have a consensus on:
> 1. whether this is an important problem
> 2. which approach we want to move forward with
>
> --
> A very happy Hadoop contributor
>


--
A very happy Hadoop contributor
