hadoop-mapreduce-issues mailing list archives

From "Jerry Chen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-5025) Key Distribution and Management for supporting crypto codec in Map Reduce
Date Mon, 18 Mar 2013 07:58:16 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13604928#comment-13604928 ]

Jerry Chen commented on MAPREDUCE-5025:

Encryption codecs and compression codecs work basically the same way, as block or stream transforms,
and therefore encryption can mostly masquerade as compression. We think this will make encryption
features easy to work with. Consider how compression codecs plug in today at the file level,
and in MapReduce, as options to SequenceFile, etc. Dropping crypto codecs into these existing
plug points would seem to introduce the least risk and change to existing code and applications.
Of course, unlike compression algorithms, crypto algorithms cannot properly transform input
without first being initialized with key material. We do need to propagate that bit of extra state
securely to where user code executes during the MapReduce job.
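
To make the analogy concrete, here is a minimal sketch using only JDK classes. It is not the attached patch and not the Hadoop codec API: `StreamCodec`, `GzipSketchCodec` and `AesCtrSketchCodec` are hypothetical names invented for illustration, and AES in CTR mode is an assumed choice of cipher. Both codecs wrap a raw stream in the same way; the only difference is that the crypto one cannot be constructed without a key and IV.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

public class StreamTransformSketch {

    /** Hypothetical stand-in for a codec plug point: wraps raw streams with a transform. */
    interface StreamCodec {
        OutputStream wrapOutput(OutputStream raw) throws IOException, GeneralSecurityException;
        InputStream wrapInput(InputStream raw) throws IOException, GeneralSecurityException;
    }

    /** Compression: the transform needs no state beyond the stream itself. */
    static class GzipSketchCodec implements StreamCodec {
        public OutputStream wrapOutput(OutputStream raw) throws IOException {
            return new GZIPOutputStream(raw);
        }
        public InputStream wrapInput(InputStream raw) throws IOException {
            return new GZIPInputStream(raw);
        }
    }

    /** Encryption: same shape, but unusable until it is handed key material. */
    static class AesCtrSketchCodec implements StreamCodec {
        private final SecretKey key;
        private final byte[] iv;

        AesCtrSketchCodec(SecretKey key, byte[] iv) {
            this.key = key;
            this.iv = iv.clone();
        }

        private Cipher cipher(int mode) throws GeneralSecurityException {
            Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
            c.init(mode, key, new IvParameterSpec(iv));
            return c;
        }
        public OutputStream wrapOutput(OutputStream raw) throws GeneralSecurityException {
            return new CipherOutputStream(raw, cipher(Cipher.ENCRYPT_MODE));
        }
        public InputStream wrapInput(InputStream raw) throws GeneralSecurityException {
            return new CipherInputStream(raw, cipher(Cipher.DECRYPT_MODE));
        }
    }

    /** Writes data through the codec, then reads it back through the same codec. */
    static byte[] roundTrip(StreamCodec codec, byte[] data)
            throws IOException, GeneralSecurityException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream out = codec.wrapOutput(sink)) {
            out.write(data);
        }
        ByteArrayOutputStream result = new ByteArrayOutputStream();
        try (InputStream in = codec.wrapInput(new ByteArrayInputStream(sink.toByteArray()))) {
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                result.write(buf, 0, n);
            }
        }
        return result.toByteArray();
    }

    /** Builds the crypto codec with a fresh random key and IV (demo only). */
    static StreamCodec newAesCodec() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        return new AesCtrSketchCodec(gen.generateKey(), iv);
    }

    public static void main(String[] args) throws Exception {
        byte[] record = "key\tvalue\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(record, roundTrip(new GzipSketchCodec(), record)));
        System.out.println(Arrays.equals(record, roundTrip(newAesCodec(), record)));
    }
}
```

In this sketch the key is generated in the same process that uses it. The open problem this issue is about is how that key reaches the tasks securely, which the sketch does not attempt to solve.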

A transparent encrypted file system would be a good feature. Perhaps that is something we should discuss
on another JIRA, e.g. "EncryptingFilterFileSystem"? As a global filter on a MapReduce job, however, we think
it would be heavyweight and limiting. It would complicate configuration and code if some jobs need
an untranslated view of plain files and a translated view of encrypted ones simultaneously,
which we think will be the common case. Some input files, and therefore some output files, will be sensitive
and require encryption, but others will not, and encryption introduces costs, so we imagine
jobs would optimize its use.
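
The mixed case described above, where one job reads plain and encrypted inputs side by side, might look roughly like the following sketch. Everything here is an assumption made for illustration: the ".aes" suffix rule, the class name `MixedInputSketch`, the hard-coded placeholder key and the all-zero IV. The point is only that the decision is made per file rather than by a filter over the whole job.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class MixedInputSketch {
    // Placeholder key and IV for the demo only; real key material must come from a secure source.
    static final byte[] KEY = "0123456789abcdef".getBytes(StandardCharsets.US_ASCII);
    static final byte[] IV = new byte[16];

    static Cipher cipher(int mode) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CTR/NoPadding");
        c.init(mode, new SecretKeySpec(KEY, "AES"), new IvParameterSpec(IV));
        return c;
    }

    /** Hypothetical rule: only names ending in ".aes" are treated as encrypted. */
    static InputStream open(String name, byte[] stored) throws GeneralSecurityException {
        InputStream raw = new ByteArrayInputStream(stored);
        return name.endsWith(".aes")
                ? new CipherInputStream(raw, cipher(Cipher.DECRYPT_MODE))
                : raw; // plain files pay no decryption cost
    }

    /** Drains a stream and decodes it as UTF-8 text. */
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        for (int n; (n = in.read(buf)) != -1; ) {
            out.write(buf, 0, n);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        byte[] plain = "public\n".getBytes(StandardCharsets.UTF_8);
        byte[] secret = cipher(Cipher.ENCRYPT_MODE)
                .doFinal("sensitive\n".getBytes(StandardCharsets.UTF_8));

        // The same job reads both inputs; only one goes through the cipher.
        System.out.print(readAll(open("logs.txt", plain)));
        System.out.print(readAll(open("customers.aes", secret)));
    }
}
```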

> Key Distribution and Management for supporting crypto codec in Map Reduce
> -------------------------------------------------------------------------
>                 Key: MAPREDUCE-5025
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5025
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: security
>    Affects Versions: trunk
>            Reporter: Jerry Chen
>            Assignee: Jerry Chen
>         Attachments: MAPREDUCE-5025.patch
>   Original Estimate: 504h
>  Remaining Estimate: 504h
> This task defines the work to enable Map Reduce to utilize the Crypto Codec framework
> to support encryption and decryption of data during a MapReduce job.
> According to some real use cases and discussions from the community, for encrypting
> and decrypting files in Map Reduce, we have the following requirements:
>   1. Different stages (input, output, intermediate output) should have the flexibility
>      to choose whether to encrypt or not, as well as which crypto codec to use.
>   2. Different stages may have different schemes for providing the keys.
>   3. Different files (for example, different input files) may have or use different keys.
>   4. Support a flexible way of retrieving keys for encryption or decryption.
> So this task defines and provides the framework for supporting these requirements as
> well as the implementations for common use and key retrieval scenarios.
> The design document of this part is included in the Hadoop Crypto Design attached in
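
One way to picture the four requirements in the description is the sketch below. It is not taken from the design document or the patch: `Stage`, `KeyProvider`, `StagePolicy` and the codec name "aes-ctr" are hypothetical, and the "keys" are placeholder bytes rather than real key material. Each stage opts in separately, each stage has its own key scheme, keys can vary per file, and key retrieval sits behind a pluggable interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.EnumMap;
import java.util.Map;
import java.util.Optional;

public class StageKeySketch {
    enum Stage { INPUT, INTERMEDIATE, OUTPUT }

    /** Hypothetical key lookup; a real one might consult a keystore or key server. */
    interface KeyProvider {
        byte[] keyFor(String file);
    }

    /** Codec name plus key source chosen for one stage. */
    static final class StagePolicy {
        final String codecName;
        final KeyProvider keys;

        StagePolicy(String codecName, KeyProvider keys) {
            this.codecName = codecName;
            this.keys = keys;
        }
    }

    private final Map<Stage, StagePolicy> policies = new EnumMap<>(Stage.class);

    /** Requirements 1 and 2: each stage opts in with its own codec and key scheme. */
    StageKeySketch encrypt(Stage stage, String codecName, KeyProvider keys) {
        policies.put(stage, new StagePolicy(codecName, keys));
        return this;
    }

    /** An empty result means the stage is left unencrypted. */
    Optional<StagePolicy> policyFor(Stage stage) {
        return Optional.ofNullable(policies.get(stage));
    }

    /** Requirement 3: the key may differ per file within a stage. */
    Optional<byte[]> keyFor(Stage stage, String file) {
        return policyFor(stage).map(p -> p.keys.keyFor(file));
    }

    /** Example job setup: encrypted input and output, unencrypted intermediate data. */
    static StageKeySketch demo() {
        // Placeholder "keys" built from names; this is not real key derivation.
        KeyProvider perFile = file -> ("key-for-" + file).getBytes(StandardCharsets.UTF_8);
        KeyProvider shared = file -> "one-output-key".getBytes(StandardCharsets.UTF_8);
        return new StageKeySketch()
                .encrypt(Stage.INPUT, "aes-ctr", perFile)
                .encrypt(Stage.OUTPUT, "aes-ctr", shared);
    }

    public static void main(String[] args) {
        StageKeySketch job = demo();
        System.out.println("intermediate encrypted: "
                + job.policyFor(Stage.INTERMEDIATE).isPresent());
        System.out.println("input keys differ per file: "
                + !Arrays.equals(job.keyFor(Stage.INPUT, "a.seq").get(),
                                 job.keyFor(Stage.INPUT, "b.seq").get()));
    }
}
```

Requirement 4 is represented only by the `KeyProvider` interface; how a provider actually fetches keys is left open here.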

