hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7503) Support Hive's multi-table insert query with Spark [Spark Branch]
Date Sat, 20 Sep 2014 01:10:35 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141643#comment-14141643 ]

Hive QA commented on HIVE-7503:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12670148/HIVE-7503.7-spark.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 6437 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestNegativeCliDriver.testNegativeCliDriver_fs_default_name2
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_insert1
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union18
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union19
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_union_remove_6
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/139/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/139/console
Test logs: http://ec2-54-176-176-199.us-west-1.compute.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-139/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12670148

> Support Hive's multi-table insert query with Spark [Spark Branch]
> -----------------------------------------------------------------
>
>                 Key: HIVE-7503
>                 URL: https://issues.apache.org/jira/browse/HIVE-7503
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Chao
>              Labels: spark-m1
>         Attachments: HIVE-7503.1-spark.patch, HIVE-7503.2-spark.patch, HIVE-7503.3-spark.patch, HIVE-7503.4-spark.patch, HIVE-7503.5-spark.patch, HIVE-7503.6-spark.patch, HIVE-7503.7-spark.patch
>
>
> For Hive's multi-insert query (https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML), there may be an MR job for each insert. When we implement this with Spark, it would be nice if all the inserts could happen concurrently.
> It seems that this functionality isn't available in Spark. To make things worse, the source of the insert may be recomputed unless it's staged. Even with staging, the inserts will happen sequentially, which hurts performance.
> This task is to find out what it takes in Spark to enable this without requiring staging of the source or sequential insertion. If this has to be solved in Hive, find an optimal way to do it.
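
The recomputation concern in the description above can be illustrated with a small sketch. This is plain Python, not the Spark or Hive API; the `Source` class, the `insert_unstaged` and `insert_staged` helpers, and the sample predicates are all hypothetical names made up for illustration. It only shows that an unstaged source is evaluated once per insert, while a staged source is evaluated once but the inserts still run one after another.

```python
# Illustrative only: plain Python, not the Spark or Hive API.
# All names here (Source, insert_unstaged, insert_staged) are hypothetical.


class Source:
    """Stand-in for the shared source of a multi-table insert."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.scans = 0  # number of times the source has been computed

    def compute(self):
        self.scans += 1
        return list(self._rows)


def insert_unstaged(source, predicates):
    # Each insert evaluates the source again, as when nothing is staged.
    return [[r for r in source.compute() if p(r)] for p in predicates]


def insert_staged(source, predicates):
    # The source is computed once and reused, but the inserts
    # are still performed sequentially, one predicate at a time.
    staged = source.compute()
    return [[r for r in staged if p(r)] for p in predicates]


# Two "inserts", each selecting a different slice of the same source.
predicates = [lambda r: r < 3, lambda r: r >= 3]

unstaged = Source(range(6))
out_a = insert_unstaged(unstaged, predicates)

staged = Source(range(6))
out_b = insert_staged(staged, predicates)

print(unstaged.scans, staged.scans)
```

Both variants produce the same per-insert results; they differ only in how often the source is computed. Neither one performs the inserts concurrently, which is the gap this task asks about.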



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
