hadoop-common-dev mailing list archives

From "Zheng, Kai" <kai.zh...@intel.com>
Subject RE: Introduce Apache Kerby to Hadoop
Date Sun, 28 Feb 2016 04:33:02 GMT
Thanks Andrew for the update on HBase side!

>> Throughput drops 3-4x, or worse.
Hopefully we can avoid much of the encryption overhead. We're prototyping a solution to address that.


-----Original Message-----
From: Andrew Purtell [mailto:andrew.purtell@gmail.com] 
Sent: Saturday, February 27, 2016 5:35 PM
To: common-dev@hadoop.apache.org
Subject: Re: Introduce Apache Kerby to Hadoop

I get excited thinking about the prospect of better performance with auth-conf QoP. HBase
RPC is an increasingly distant fork but still close enough to Hadoop in that respect. Our
bulk data transfer protocol isn't a separate thing like in HDFS (where that separation avoids a
SASL-wrapped implementation), so we really suffer when auth-conf is negotiated. You'll see the
same impact wherever there is a high frequency of NameNode RPC calls or similar. Throughput drops
3-4x, or worse.
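The per-message cost Andrew describes comes from SASL wrap/unwrap once auth-conf is negotiated. As a minimal JDK-only sketch of where that wrapping happens (using DIGEST-MD5 in-process rather than Hadoop's actual GSSAPI/Kerberos mechanism; the "hdfs"/"alice" names are made up for illustration):

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.sasl.AuthorizeCallback;
import javax.security.sasl.RealmCallback;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslServer;

public class QopWrapDemo {

    // Handshake with auth-conf QoP, then wrap/unwrap one payload and return it.
    static String roundTrip(String payload) throws Exception {
        Map<String, String> props = new HashMap<>();
        props.put(Sasl.QOP, "auth-conf"); // privacy: every message gets wrapped

        CallbackHandler clientCb = cbs -> {
            for (Callback cb : cbs) {
                if (cb instanceof NameCallback) ((NameCallback) cb).setName("alice");
                else if (cb instanceof PasswordCallback) ((PasswordCallback) cb).setPassword("secret".toCharArray());
                else if (cb instanceof RealmCallback) ((RealmCallback) cb).setText(((RealmCallback) cb).getDefaultText());
            }
        };
        CallbackHandler serverCb = cbs -> {
            for (Callback cb : cbs) {
                if (cb instanceof PasswordCallback) ((PasswordCallback) cb).setPassword("secret".toCharArray());
                else if (cb instanceof AuthorizeCallback) {
                    AuthorizeCallback ac = (AuthorizeCallback) cb;
                    ac.setAuthorized(ac.getAuthenticationID().equals(ac.getAuthorizationID()));
                }
            }
        };

        SaslServer server = Sasl.createSaslServer("DIGEST-MD5", "hdfs", "localhost", props, serverCb);
        SaslClient client = Sasl.createSaslClient(
                new String[]{"DIGEST-MD5"}, null, "hdfs", "localhost", props, clientCb);

        // DIGEST-MD5: the server issues the first challenge; two round trips complete it.
        byte[] challenge = server.evaluateResponse(new byte[0]);
        byte[] response = client.evaluateChallenge(challenge);
        challenge = server.evaluateResponse(response);
        client.evaluateChallenge(challenge); // client verifies the server's rspauth

        // This wrap/unwrap is the per-message cost auth-conf adds to each RPC.
        byte[] plain = payload.getBytes(StandardCharsets.UTF_8);
        byte[] wrapped = client.wrap(plain, 0, plain.length);
        byte[] unwrapped = server.unwrap(wrapped, 0, wrapped.length);

        String qop = (String) client.getNegotiatedProperty(Sasl.QOP);
        client.dispose();
        server.dispose();
        if (!"auth-conf".equals(qop)) throw new IllegalStateException("negotiated " + qop);
        return new String(unwrapped, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("getBlockLocations /user/alice/data"));
    }
}
```

Every request and response pays the `wrap`/`unwrap` cost once auth-conf is negotiated, which is why a high-frequency RPC path suffers most.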

> On Feb 22, 2016, at 4:56 PM, Zheng, Kai <kai.zheng@intel.com> wrote:
> Thanks for the confirmation and further inputs, Steve. 
>>> the latter would dramatically reduce the cost of wire-encrypting IPC.
> Yes, optimizing Hadoop IPC/RPC encryption is another opportunity where Kerby can help. It's
possible because we may hook Chimera or AES-NI support into the Kerberos layer by leveraging
the Kerberos library. As may be noted, HADOOP-12725 is ongoing for this aspect. There
may be good results and further updates on this soon.
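As a rough illustration of the idea under discussion (delegating bulk encryption to AES, which HotSpot accelerates with AES-NI intrinsics on supporting CPUs), here is a minimal JDK-only sketch; it stands in for Chimera/HADOOP-12725 rather than reproducing their actual code:

```java
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class AesCtrSketch {
    // One-shot AES/CTR; HotSpot JIT-compiles this to AES-NI instructions where available.
    static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher.doFinal(data);
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16]; // AES-128
        byte[] iv = new byte[16];
        SecureRandom rng = new SecureRandom();
        rng.nextBytes(key);
        rng.nextBytes(iv);

        byte[] plain = "IPC request payload".getBytes(StandardCharsets.UTF_8);
        byte[] enc = aesCtr(Cipher.ENCRYPT_MODE, key, iv, plain);
        byte[] dec = aesCtr(Cipher.DECRYPT_MODE, key, iv, enc);
        System.out.println(new String(dec, StandardCharsets.UTF_8));
    }
}
```

The point is that if the Kerberos layer can negotiate an AES session key and hand the bulk work to a hardware-accelerated cipher like this, the wrap/unwrap overhead shrinks considerably.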
>>> For now, I'd like to see basic steps - upgrading MiniKDC to Kerby, see how it works.
> Yes, starting with the initial step of upgrading MiniKDC to use Kerby is the right thing
to do. After some interactions with the Kerby project, we may have more ideas on how to proceed
with the follow-ups.
>>> Long term, I'd like Hadoop 3 to be Kerby-ized
> This sounds great! With the necessary support from the community, such as feedback and patch
reviews, we can speed up the related work.
> Regards,
> Kai
> -----Original Message-----
> From: Steve Loughran [mailto:stevel@hortonworks.com]
> Sent: Monday, February 22, 2016 6:51 PM
> To: common-dev@hadoop.apache.org
> Subject: Re: Introduce Apache Kerby to Hadoop
> I've discussed this offline with Kai, as part of the "let's fix kerberos" project. Not
only is it a better Kerberos engine, we can do more diagnostics, get better algorithms and
ultimately get better APIs for doing Kerberos and SASL —the latter would dramatically reduce
the cost of wire-encrypting IPC.
> For now, I'd like to see basic steps - upgrading MiniKDC to Kerby, and see how it works.
> Long term, I'd like Hadoop 3 to be Kerby-ized
>> On 22 Feb 2016, at 06:41, Zheng, Kai <kai.zheng@intel.com> wrote:
>> Hi folks,
>> I'd like to mention Apache Kerby [1], a sub-project of the Apache Directory project, and
propose introducing it to Hadoop.
>> Apache Kerby is a Kerberos-centric project that aims to provide the first Java Kerberos
library with both client and server support. The relevant features include:
>> - Full Kerberos encryption types, aligned with both MIT KDC and MS AD;
>> - Client APIs that allow login via password, credential cache, keytab file, etc.;
>> - Utilities to generate, operate on, and inspect keytab and credential cache files;
>> - A simple KDC server that borrows some ideas from Hadoop-MiniKDC and can be used in tests,
>>   with minimal overhead in external dependencies;
>> - A brand-new token mechanism (experimental), with which a JWT token can be exchanged for
>>   a TGT or service ticket;
>> - Anonymous PKINIT support (experimental), making it the first Java library that supports
>>   this major Kerberos extension.
>> The project stands alone and depends only on the JRE, for easier usage. It has made its
first release (1.0.0-RC1), and a second release (RC2) is upcoming.
>> As an initial step, this proposal suggests using Apache Kerby to upgrade the existing
code related to ApacheDS for Kerberos support. The advantages:
>> 1. The kerby-kerb library is all that is needed; it is pure Java, SLF4J is its only
>> dependency, and the whole library is rather small;
>> 2. The library includes a SimpleKDC for test usage, which borrows the MiniKDC idea and
>> implements all the support existing in MiniKDC. We had a POC that rewrote MiniKDC using
>> Kerby SimpleKDC, and it works fine;
>> 3. Full Kerberos encryption types (many of which are not available in the JRE but are
>> supported by major Kerberos vendors) and more functionality, such as credential cache
>> support;
>> 4. Perhaps most important, Hadoop MiniKDC and others depend on the old Kerberos
>> implementation in the Directory Server project, but that implementation is no longer
>> maintained. The Directory project plans to replace it with Kerby, and MiniKDC can use
>> Kerby directly to simplify the dependencies;
>> 5. Extensively covered by unit tests and already in use for some time (e.g., at PSU),
>> even in production environments;
>> 6. Actively developed, so fixes can be made and released promptly if necessary, separately
and independently from other components in the Apache Directory project. By actively developing
Apache Kerby and now applying it to Hadoop, we hope to make Kerberos deployment,
troubleshooting, and further enhancement much easier.
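For a concrete sense of point 2, here is a hedged sketch of what a MiniKDC-style test setup might look like on top of Kerby's SimpleKDC. The class and method names (`SimpleKdcServer` from the kerb-simplekdc module) are my reading of the Kerby 1.x API and are not verified against a particular release; this is a sketch, not a drop-in MiniKDC replacement, and it needs the Kerby jars to compile.

```java
// Sketch only: API names assumed from org.apache.kerby:kerb-simplekdc
// and may differ between Kerby releases.
import java.io.File;
import org.apache.kerby.kerberos.kerb.server.SimpleKdcServer;

public class SimpleKdcSketch {
    public static void main(String[] args) throws Exception {
        SimpleKdcServer kdc = new SimpleKdcServer();
        kdc.setKdcRealm("EXAMPLE.COM");
        kdc.setKdcHost("localhost");
        kdc.init();
        kdc.start();

        // Provision test principals: one with a password, one exported to a keytab.
        kdc.createPrincipal("alice@EXAMPLE.COM", "alice-password");
        kdc.createAndExportPrincipals(new File("test.keytab"), "hdfs/localhost@EXAMPLE.COM");

        // ... run Kerberos-dependent tests against localhost here ...

        kdc.stop();
    }
}
```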
>> We hope this is a good beginning, and that eventually Apache Kerby can benefit other
projects in the ecosystem as well.
>> This Kerberos-related work is a long-term effort led by Weihua Jiang at Intel, and has been
kindly encouraged by Andrew Purtell, Steve Loughran, Gangumalla Uma, Andrew Wang, and others;
thanks a lot for their great discussions and input in the past.
>> Your feedback is very welcome. Thanks in advance.
>> [1] https://github.com/apache/directory-kerby
>> Regards,
>> Kai