hadoop-mapreduce-issues mailing list archives

From "Benoy Antony (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-4491) Encryption and Key Protection
Date Fri, 07 Sep 2012 22:12:09 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451050#comment-13451050 ]

Benoy Antony commented on MAPREDUCE-4491:
-----------------------------------------

Key Protection is simple to explain.
The JobClient retrieves keys from a configured keystore, encrypts the keys together with the jobId
using the cluster public key, and submits the encrypted blob as part of the job credentials.
During job localization, the TaskTracker decrypts the encrypted blob using the cluster private key
and verifies that the jobId inside the blob matches the JobId of the task. During task launch,
the keys are made available to the child (task) process as an environment variable.

Since the JobId is part of the encrypted blob, the JobId verification prevents a replay attack:
a blob captured from one job is rejected when it is presented with a different job. It is easy
to add integrity protection as well.
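
As a rough illustration of the flow described above, here is a minimal sketch using only the standard Java crypto API. It is not the code from the patch. The class name KeyBlobSketch, the methods seal and open, the blob layout (jobId followed by the key bytes), and the choice of RSA-OAEP are all assumptions made for the example. It also assumes the payload is small enough for a single RSA operation; a real implementation carrying several keys would more likely use a hybrid scheme.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical sketch: none of these names come from the Hadoop code base.
public class KeyBlobSketch {
    // Assumed transformation; the comment does not say which one the patch uses.
    static final String TRANSFORM = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // JobClient side: bind the secret to the jobId, then encrypt with the cluster public key.
    static byte[] seal(PublicKey clusterPublic, String jobId, byte[] secret) throws Exception {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeUTF(jobId);          // the jobId travels inside the encrypted blob
        out.writeInt(secret.length);
        out.write(secret);
        out.flush();
        Cipher cipher = Cipher.getInstance(TRANSFORM);
        cipher.init(Cipher.ENCRYPT_MODE, clusterPublic);
        return cipher.doFinal(buf.toByteArray());
    }

    // TaskTracker side: decrypt with the cluster private key and check the jobId.
    static byte[] open(PrivateKey clusterPrivate, String taskJobId, byte[] blob) throws Exception {
        Cipher cipher = Cipher.getInstance(TRANSFORM);
        cipher.init(Cipher.DECRYPT_MODE, clusterPrivate);
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(cipher.doFinal(blob)));
        String sealedJobId = in.readUTF();
        if (!sealedJobId.equals(taskJobId)) {
            // A blob replayed with a different job fails here.
            throw new SecurityException("jobId in blob does not match the task's jobId");
        }
        byte[] secret = new byte[in.readInt()];
        in.readFully(secret);
        return secret;
    }

    public static void main(String[] args) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);
        KeyPair cluster = gen.generateKeyPair();   // stands in for the cluster key pair

        byte[] secret = "0123456789abcdef".getBytes(StandardCharsets.UTF_8);
        byte[] blob = seal(cluster.getPublic(), "job_201209071200_0001", secret);

        // Same jobId: the TaskTracker recovers the key.
        byte[] recovered = open(cluster.getPrivate(), "job_201209071200_0001", blob);
        System.out.println("recovered " + recovered.length + " key bytes");

        // Different jobId: the replayed blob is rejected.
        try {
            open(cluster.getPrivate(), "job_201209071200_0002", blob);
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In this sketch the jobId check is the only thing tying a blob to a job, which is why the question of who is allowed to ask a TaskTracker to launch a task for a given jobId matters.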

Now, the scheme was designed to be used in a secure cluster. It is good to explore whether
it can be used in a non-secure cluster. 

One issue was with the cluster private key. It should be accessible only to the TaskTracker
process. If access is determined by user permissions, then tasks must run as a different user
from the TaskTracker. That user need not be the job owner; it can be a fixed user.

I believe you are bringing up another issue in this regard.
If a rogue task can make a TT launch another rogue task with a jobId matching the one inside
the encrypted blob, then the keys are available to the newly launched rogue task.
That's a good point. Basically, the rogue task would be acting as a JT/AppMaster. I am not sure
whether that is possible. Even if it's possible, there should be ways to detect it.




                
> Encryption and Key Protection
> -----------------------------
>
>                 Key: MAPREDUCE-4491
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4491
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: documentation, security, task-controller, tasktracker
>            Reporter: Benoy Antony
>            Assignee: Benoy Antony
>         Attachments: Hadoop_Encryption.pdf, Hadoop_Encryption.pdf
>
>
> When dealing with sensitive data, it is required to keep the data encrypted wherever
> it is stored. Common use case is to pull encrypted data out of a datasource and store in HDFS
> for analysis. The keys are stored in an external keystore.
> The feature adds a customizable framework to integrate different types of keystores,
> support for Java KeyStore, read keys from keystores, and transport keys from JobClient to
> Tasks.
> The feature adds PGP encryption as a codec and additional utilities to perform encryption
> related steps.
> The design document is attached. It explains the requirement, design and use cases.
> Kindly review and comment. Collaboration is very much welcome.
> I have a tested patch for this for 1.1 and will upload it soon as an initial work for
> further refinement.
> Update: The patches are uploaded to subtasks.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
