spark-issues mailing list archives

From "liyunzhang_intel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-5682) Add encrypted shuffle in spark
Date Thu, 02 Jul 2015 05:24:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-5682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14611491#comment-14611491
] 

liyunzhang_intel commented on SPARK-5682:
-----------------------------------------

[~hujiayin]: Thanks for your comment.
{quote}
The solution relied on hadoop API and maybe downgrade the performance. 
{quote}
On "the solution relied on hadoop API": I assume you mean that I use org.apache.hadoop.io.Text in [CommonConfigurationKeys|https://github.com/apache/spark/pull/4491/files#diff-a76c55d0e8f2e4e1a6cb5848826585fe].

But I see this differently:
{code}
// org.apache.hadoop.io.Text
@Stringable
@InterfaceAudience.Public
@InterfaceStability.Stable
public class Text extends BinaryComparable
{code}

This shows that org.apache.hadoop.io.Text is annotated as stable, which means the interface it provides
is not expected to change much in later releases.

On "downgrade the performance": do you have any test results that show this?
 



> Add encrypted shuffle in spark
> ------------------------------
>
>                 Key: SPARK-5682
>                 URL: https://issues.apache.org/jira/browse/SPARK-5682
>             Project: Spark
>          Issue Type: New Feature
>          Components: Shuffle
>            Reporter: liyunzhang_intel
>         Attachments: Design Document of Encrypted Spark Shuffle_20150209.docx, Design Document of Encrypted Spark Shuffle_20150318.docx, Design Document of Encrypted Spark Shuffle_20150402.docx, Design Document of Encrypted Spark Shuffle_20150506.docx
>
>
> Encrypted shuffle is enabled in Hadoop 2.6, which makes the handling of shuffle data safer.
> This feature is also needed in Spark. AES is a specification for the encryption of electronic
> data. AES has 5 common modes of operation, and CTR is one of them. We use two codecs, JceAesCtrCryptoCodec
> and OpensslAesCtrCryptoCodec, to enable encrypted shuffle in Spark; both are also used in Hadoop
> encrypted shuffle. JceAesCtrCryptoCodec uses the encryption algorithms the JDK provides, while OpensslAesCtrCryptoCodec
> uses the encryption algorithms OpenSSL provides.
> Because UGI credential info is used in the process of encrypted shuffle, we first enable
> encrypted shuffle on the Spark-on-YARN framework.
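
For readers unfamiliar with the AES CTR mode mentioned in the description, here is a minimal sketch of a round trip using only the JDK's JCE provider (javax.crypto). It is not the JceAesCtrCryptoCodec implementation and not code from the pull request; the class name AesCtrSketch, its methods, and the all-random key and IV handling are assumptions made purely for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Spark or Hadoop.
final class AesCtrSketch {
    // CTR turns AES into a stream cipher, so no padding is required.
    private static final String TRANSFORMATION = "AES/CTR/NoPadding";

    static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, iv, plain);
    }

    static byte[] decrypt(byte[] key, byte[] iv, byte[] encrypted) {
        return run(Cipher.DECRYPT_MODE, key, iv, encrypted);
    }

    private static byte[] run(int mode, byte[] key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            // Wrap checked crypto exceptions so callers stay simple in this sketch.
            throw new IllegalStateException("AES-CTR operation failed", e);
        }
    }

    public static void main(String[] args) {
        SecureRandom random = new SecureRandom();
        byte[] key = new byte[16]; // 128-bit AES key (assumed size for the sketch)
        byte[] iv = new byte[16];  // initial counter block, one AES block long
        random.nextBytes(key);
        random.nextBytes(iv);

        byte[] plain = "shuffle block".getBytes(StandardCharsets.UTF_8);
        byte[] encrypted = encrypt(key, iv, plain);
        byte[] decrypted = decrypt(key, iv, encrypted);
        System.out.println("round trip ok: " + Arrays.equals(plain, decrypted));
    }
}
```

In CTR mode the ciphertext has the same length as the plaintext, and the same key and IV pair must never be reused for different data. How keys and IVs would actually be distributed for shuffle (for example through UGI credentials) is outside this sketch.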



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

