spark-issues mailing list archives

From "Ferdinand Xu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-5682) Add encrypted shuffle in spark
Date Wed, 11 Nov 2015 15:34:11 GMT

    [ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15000526#comment-15000526
] 

Ferdinand Xu commented on SPARK-5682:
-------------------------------------

Thank you for your question. The key is generated by a key generator, which is instantiated
with the specified keygen algorithm. That part of the work is in the method CryptoConf#initSparkShuffleCredentials.
More detailed information is available in the PR (https://github.com/apache/spark/pull/8880).
For the IV part, the latest PR uses Chimera (https://github.com/intel-hadoop/chimera) as an external
library (see https://github.com/intel-hadoop/chimera/blob/master/src/main/java/com/intel/chimera/JceAesCtrCryptoCodec.java#L70
and https://github.com/intel-hadoop/chimera/blob/master/src/main/java/com/intel/chimera/OpensslAesCtrCryptoCodec.java#L81).
You can also dig into the code to see how the IV is calculated from the counter and the initial IV (https://github.com/intel-hadoop/chimera/blob/master/src/main/java/com/intel/chimera/AesCtrCryptoCodec.java#L42).
The initial IV is generated by a secure random number generator.
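
To make the three steps above concrete (key from a key generator, random initial IV, per-block IV derived from the initial IV and a counter), here is a minimal sketch using only the JDK. It is not the code from the PR or from Chimera: the class and method names (CtrIvSketch, generateKey, newInitialIv, calculateIv) are hypothetical, "AES" with a 128-bit key is an assumed choice of keygen algorithm, and the counter addition shown is the usual AES-CTR approach of treating the 16-byte IV as a big-endian integer.

```java
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Hypothetical helper class, for illustration only.
final class CtrIvSketch {
    static final int AES_BLOCK_SIZE = 16;

    // Create a key from a KeyGenerator instantiated for the given algorithm name.
    // The algorithm and key size are assumptions supplied by the caller.
    static SecretKey generateKey(String algorithm, int keyBits) {
        try {
            KeyGenerator keyGen = KeyGenerator.getInstance(algorithm);
            keyGen.init(keyBits);
            return keyGen.generateKey();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Unknown keygen algorithm: " + algorithm, e);
        }
    }

    // Create a random 16-byte initial IV from a SecureRandom.
    static byte[] newInitialIv() {
        byte[] iv = new byte[AES_BLOCK_SIZE];
        new SecureRandom().nextBytes(iv);
        return iv;
    }

    // Compute the IV for a given block counter as (initIv + counter),
    // treating both as unsigned big-endian integers and wrapping at 128 bits.
    static byte[] calculateIv(byte[] initIv, long counter) {
        if (initIv.length != AES_BLOCK_SIZE) {
            throw new IllegalArgumentException("Initial IV must be 16 bytes");
        }
        byte[] iv = new byte[AES_BLOCK_SIZE];
        int carry = 0;
        // Walk from the least significant byte (last) to the most significant (first).
        for (int i = AES_BLOCK_SIZE - 1, j = 0; i >= 0; i--, j++) {
            int sum = (initIv[i] & 0xff) + carry;
            if (j < 8) { // a long contributes only to the low 8 bytes
                sum += (int) (counter & 0xff);
                counter >>>= 8;
            }
            iv[i] = (byte) sum;
            carry = sum >>> 8;
        }
        return iv;
    }
}
```

With this sketch, calculateIv(newInitialIv(), n) gives the IV for the n-th 16-byte block of the stream, which is what lets a CTR stream be decrypted starting from an arbitrary block offset.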

> Add encrypted shuffle in spark
> ------------------------------
>
>                 Key: SPARK-5682
>                 URL: https://issues.apache.org/jira/browse/SPARK-5682
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle
>            Reporter: liyunzhang_intel
>         Attachments: Design Document of Encrypted Spark Shuffle_20150209.docx, Design
Document of Encrypted Spark Shuffle_20150318.docx, Design Document of Encrypted Spark Shuffle_20150402.docx,
Design Document of Encrypted Spark Shuffle_20150506.docx
>
>
> Encrypted shuffle is enabled in Hadoop 2.6, which makes the shuffle process safer.
This feature is also necessary in Spark. AES is a specification for the encryption of electronic
data. There are 5 common modes of operation for AES, and CTR is one of them. We use two codecs, JceAesCtrCryptoCodec
and OpensslAesCtrCryptoCodec, to enable Spark encrypted shuffle; both are also used in Hadoop
encrypted shuffle. JceAesCtrCryptoCodec uses the encryption algorithms the JDK provides, while OpensslAesCtrCryptoCodec
uses the encryption algorithms OpenSSL provides.
> Because UGI credential info is used in the process of encrypted shuffle, we first enable
encrypted shuffle on the Spark-on-YARN framework.
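
As a rough illustration of what "the encryption algorithms the JDK provides" means for AES in CTR mode, the following sketch encrypts and decrypts a buffer with the standard javax.crypto API. It is not JceAesCtrCryptoCodec itself; the class name JceCtrSketch and the method apply are hypothetical, and key and IV handling are simplified to raw byte arrays.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class, for illustration only.
final class JceCtrSketch {
    // Run AES/CTR over the data. mode is Cipher.ENCRYPT_MODE or Cipher.DECRYPT_MODE;
    // key is a 16-, 24-, or 32-byte AES key and iv is a 16-byte IV.
    static byte[] apply(int mode, byte[] key, byte[] iv, byte[] data) {
        try {
            // CTR is a stream mode, so no padding is needed and the output
            // has the same length as the input.
            Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES/CTR operation failed", e);
        }
    }
}
```

Calling apply with Cipher.ENCRYPT_MODE and then with Cipher.DECRYPT_MODE, using the same key and IV, returns the original bytes.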



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

