hadoop-common-issues mailing list archives

From "Yi Liu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-10150) Hadoop cryptographic file system
Date Thu, 12 Dec 2013 05:29:07 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-10150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13846078#comment-13846078

Yi Liu commented on HADOOP-10150:

Larry, the patch attached to HADOOP-10156, a subtask of HADOOP-10150, is a pure Java implementation
with no external dependencies. The first patch we put up did contain hadoop-crypto, a
crypto codec framework that includes some non-Java code implemented in C. However, the
latest patch on HADOOP-10156 instead provides ciphers through the standard javax.crypto.Cipher
interface, using the cipher implementations shipped with the JRE by default, rather than
hadoop-crypto. Java itself provides the mechanism for plugging in additional Cipher implementations:
the JCE (Java Cryptography Extension). 
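To make the JCE mechanism concrete, here is a minimal sketch (not from the HADOOP-10156 patch itself) that enumerates the installed providers and asks JCE to resolve an AES cipher; the transformation string "AES/CTR/NoPadding" is used purely as an example of a cipher every stock JRE can supply:

```java
import java.security.Provider;
import java.security.Security;
import javax.crypto.Cipher;

public class JceProviders {
    public static void main(String[] args) throws Exception {
        // JCE resolves a transformation against the installed providers
        // in preference order; any registered provider may supply it.
        for (Provider p : Security.getProviders()) {
            System.out.println(p.getName() + " " + p.getVersionStr());
        }
        // A stream-friendly transformation available in stock JREs.
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        System.out.println("resolved by provider: " + c.getProvider().getName());
    }
}
```

A third-party provider such as Diceros or BouncyCastle joins this same list once registered (statically in java.security or via Security.addProvider), which is why no Hadoop-side code change is needed to swap ciphers.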

Because the default JCE providers shipped with common JREs do not utilize the hardware acceleration
(AES-NI) that has been available for years, we have also developed a pure open source, Apache
2 licensed JCE provider named Diceros to mitigate the performance penalty. Our initial tests
show a 20x improvement over the ciphers shipped with JRE 7. We would like to contribute Diceros
as well, but to simplify review for now we are hosting Diceros on GitHub. The code submitted for
HADOOP-10156 lets the end user configure any JCE provider - for example, the default JCE
provider shipped with the JRE, Diceros ("DC"), or BouncyCastle ("BC"). Please
let me know if you have any other concerns about this approach. Thanks.
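As an illustration of that configurability (a sketch, not the actual patch code), a cipher factory can take a provider short name such as "DC" or "BC" from configuration and fall back to the JRE default when that provider is not installed; the helper name getCipher below is hypothetical:

```java
import java.security.NoSuchProviderException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class ConfigurableProvider {
    // provider is a configured short name, e.g. "DC" (Diceros) or "BC"
    // (BouncyCastle); those resolve only if the provider jar is installed.
    static Cipher getCipher(String transformation, String provider) throws Exception {
        if (provider == null) {
            return Cipher.getInstance(transformation);      // JRE default
        }
        try {
            return Cipher.getInstance(transformation, provider);
        } catch (NoSuchProviderException e) {
            return Cipher.getInstance(transformation);      // fall back
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] key = new byte[16];                          // demo key only
        byte[] iv = new byte[16];
        // "DC" is not installed here, so this falls back to the default.
        Cipher enc = getCipher("AES/CTR/NoPadding", "DC");
        enc.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                 new IvParameterSpec(iv));
        byte[] ct = enc.doFinal("hello".getBytes("UTF-8"));

        Cipher dec = getCipher("AES/CTR/NoPadding", null);
        dec.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                 new IvParameterSpec(iv));
        System.out.println(new String(dec.doFinal(ct), "UTF-8")); // prints hello
    }
}
```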

> Hadoop cryptographic file system
> --------------------------------
>                 Key: HADOOP-10150
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10150
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: security
>    Affects Versions: 3.0.0
>            Reporter: Yi Liu
>            Assignee: Yi Liu
>              Labels: rhino
>             Fix For: 3.0.0
>         Attachments: CryptographicFileSystem.patch, HADOOP cryptographic file system.pdf
> There is an increasing need for securing data when Hadoop customers use various upper
layer applications, such as Map-Reduce, Hive, Pig, HBase and so on.
> HADOOP CFS (HADOOP Cryptographic File System) is used to secure data. It is based on Hadoop's
“FilterFileSystem”, decorating DFS or other file systems, and is transparent to upper layer
applications. It is configurable, scalable and fast.
> High level requirements:
> 1.	Transparent to and no modification required for upper layer applications.
> 2.	“Seek” and “PositionedReadable” are supported for the CFS input stream if the
wrapped file system supports them.
> 3.	Very high performance for encryption and decryption, so that they do not become a bottleneck.
> 4.	Can decorate HDFS and all other file systems in Hadoop without modifying the existing
structure of the file system, such as the namenode and datanode structure when the wrapped file system
is HDFS.
> 5.	Admins can configure encryption policies, such as which directories will be encrypted.
> 6.	A robust key management framework.
> 7.	Support Pread and append operations if the wrapped file system supports them.
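The transparent decoration described above can be illustrated with the JRE's own stream wrappers. This is only a sketch of the idea, not the patch's code: the real CFS wraps Hadoop's FSDataInputStream/FSDataOutputStream and must additionally support seek and pread, which the plain javax.crypto stream classes below do not, and it obtains keys from a key management framework rather than hard-coding them:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class TransparentStreams {
    public static void main(String[] args) throws Exception {
        SecretKeySpec key = new SecretKeySpec(new byte[16], "AES"); // demo key
        IvParameterSpec iv = new IvParameterSpec(new byte[16]);

        // Write path: the application sees an ordinary OutputStream;
        // the wrapper encrypts bytes on their way to the underlying store.
        Cipher enc = Cipher.getInstance("AES/CTR/NoPadding");
        enc.init(Cipher.ENCRYPT_MODE, key, iv);
        ByteArrayOutputStream store = new ByteArrayOutputStream();
        try (CipherOutputStream out = new CipherOutputStream(store, enc)) {
            out.write("transparent to the application".getBytes("UTF-8"));
        }

        // Read path: the wrapper decrypts bytes coming back from the store.
        Cipher dec = Cipher.getInstance("AES/CTR/NoPadding");
        dec.init(Cipher.DECRYPT_MODE, key, iv);
        try (CipherInputStream in = new CipherInputStream(
                new ByteArrayInputStream(store.toByteArray()), dec)) {
            System.out.println(new String(in.readAllBytes(), "UTF-8"));
        }
    }
}
```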

This message was sent by Atlassian JIRA
