pig-dev mailing list archives

From "liyunzhang_intel (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-5176) Several ComputeSpec test cases fail
Date Tue, 11 Apr 2017 08:44:41 GMT

    [ https://issues.apache.org/jira/browse/PIG-5176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964023#comment-15964023 ]

liyunzhang_intel commented on PIG-5176:

[~nkollar]: with this patch, the user cannot upload a file with the same name twice, even when not using the netty file server in Spark 1.6. Do we really want that behavior?
If yes, we should document that a user cannot ship a file with the same name twice in Pig on Spark; otherwise, we should let users ship a file with the same name twice when they don't use the netty file server. Another option is to resolve this JIRA after we upgrade to Spark 2.1. Can you give me any suggestions?

> Several ComputeSpec test cases fail
> -----------------------------------
>                 Key: PIG-5176
>                 URL: https://issues.apache.org/jira/browse/PIG-5176
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: Nandor Kollar
>            Assignee: Nandor Kollar
>             Fix For: spark-branch
>         Attachments: PIG-5176.patch
> Several ComputeSpec test cases failed on my cluster:
> ComputeSpec_5 - ComputeSpec_13
> These scripts have a ship() part in the define, where the ship includes the script file
> too, so we add the same file to the Spark context twice. This is not a problem with Hadoop,
> but it looks like Spark doesn't allow adding the same file name twice:
> {code}
> Caused by: java.lang.IllegalArgumentException: requirement failed: File PigStreamingDepend.pl already registered.
>         at scala.Predef$.require(Predef.scala:233)
>         at org.apache.spark.rpc.netty.NettyStreamManager.addFile(NettyStreamManager.scala:69)
>         at org.apache.spark.SparkContext.addFile(SparkContext.scala:1386)
>         at org.apache.spark.SparkContext.addFile(SparkContext.scala:1348)
>         at org.apache.spark.api.java.JavaSparkContext.addFile(JavaSparkContext.scala:662)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.addResourceToSparkJobWorkingDirectory(SparkLauncher.java:462)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.shipFiles(SparkLauncher.java:371)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.addFilesToSparkJob(SparkLauncher.java:357)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.uploadResources(SparkLauncher.java:235)
>         at org.apache.pig.backend.hadoop.executionengine.spark.SparkLauncher.launchPig(SparkLauncher.java:222)
>         at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:290)
> {code}
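One way to avoid the {{IllegalArgumentException}} above is to deduplicate by file name before calling {{SparkContext.addFile}}, since Spark's NettyStreamManager rejects a second registration of the same name. The sketch below is only illustrative; the class and method names ({{DedupFileShipper}}, {{shipOnce}}) are hypothetical, not Pig's actual API:

```java
import java.io.File;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: track file names already shipped to the Spark
// context and skip duplicates, mirroring the uniqueness check that
// Spark's NettyStreamManager enforces on addFile().
public class DedupFileShipper {
    private final Set<String> shippedNames = new HashSet<>();

    /**
     * Returns true if the file should be shipped now, false if a file
     * with the same base name was already shipped (skip it to avoid
     * "requirement failed: File ... already registered").
     */
    public boolean shipOnce(String path) {
        String name = new File(path).getName();
        if (!shippedNames.add(name)) {
            return false; // duplicate base name; do not call addFile again
        }
        // In SparkLauncher this is where sparkContext.addFile(path) would run.
        return true;
    }
}
```

A guard like this would make the duplicate ship() in the ComputeSpec scripts a no-op instead of a failure, at the cost of silently ignoring a second file that happens to share a base name, which is the trade-off discussed in the comment above.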

This message was sent by Atlassian JIRA
