hadoop-common-dev mailing list archives

From sanjay Radia <san...@hortonworks.com>
Subject Re: [VOTE] Merge fs-encryption branch to trunk
Date Fri, 15 Aug 2014 22:13:29 GMT

+1 (binding)
We have made some great progress in the last few days on some of the issues I raised.
I have posted a summary of the followup items that are needed on the Jira today.
I am +1ing expecting the team will complete items 1 (distcp/cp) and 2 (webhdfs) promptly.
Before we publish transparent encryption in a 2.x release for public consumption, let us at
least complete item 1 (i.e. distcp and cp) and the flag to turn this feature on/off.

This is great work; thanks to the team for contributing this important feature.


On Aug 14, 2014, at 1:05 AM, sanjay Radia <sanjay@hortonworks.com> wrote:

> While I was originally skeptical of transparent encryption, I like the value proposition
of transparent encryption. HDFS has several layers, protocols and tools. While the HDFS core
part seems to be well done in the Jira, inserting the matching transparency in the other tools
or protocols needs to be worked through.
> I have the following areas of concern:
> - Common protocols like webhdfs should continue to work (the design doc marks this as
a goal). This issue is being discussed in the Jira but it appears that webhdfs does not currently
work with encrypted files: Andrew says that "Regarding webhdfs, it's not a recommended deployment"
and that he will modify the documentation to match that. Alejandro says "Both httpfs and webhdfs
will work just fine" but then in the same paragraph says "this could fail some security audits".
We need to resolve this quickly. Webhdfs is heavily used by many Hadoop users.
> - Common tools like cp, distcp and HAR should continue to work with non-encrypted
and encrypted files in an automatic fashion. This issue has been heavily discussed in the
Jira and at the meeting. The /.reserved./.raw mechanism appears to be a step in the right
direction for distcp and cp; however, this work has not reached its conclusion in my opinion.
Charles and I are going through the use cases and I think we are close to a clean solution
for distcp and cp. HAR still needs a concrete proposal.
> - KMS scalability in medium to large clusters. This can perhaps be addressed by getting
the keys ahead of time when a job is submitted. Without this, the KMS will need to be as
highly available and scalable as the NN. I think this is future implementation work but we
need to at least determine if this is indeed possible in case we need to modify some of the
APIs right now to support that.
> There are some other minor things under discussion, and I still need to go through the
new APIs.
> Unfortunately, at this stage I cannot give a +1 for this merge; I hope to change this
in the next day or so. I am working with the Jira's team (Alejandro, Charles, Andrew, Atm,
...) to resolve the above as quickly as possible.
> Sanjay (binding)
> On Aug 8, 2014, at 11:45 AM, Andrew Wang <andrew.wang@cloudera.com> wrote:
>> Hi all,
>> I'd like to call a vote to merge the fs-encryption branch to trunk.
>> Development of this feature has been ongoing since March on HDFS-6134 and
>> HADOOP-10150, totaling approximately 50 commits.
>> .....
>> Thanks,
>> Andrew

