hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-8859) ColumnStatsTask fails because of SparkMapJoinResolver
Date Thu, 13 Nov 2014 21:52:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-8859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14211350#comment-14211350 ]

Hive QA commented on HIVE-8859:
-------------------------------



{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12681387/HIVE-8859.1-spark.patch

{color:red}ERROR:{color} -1 due to 10 failed/errored test(s), 7234 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample_islocalmode_hook
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_join29
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_auto_sortmerge_join_15
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_optimize_nullscan
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_stats8
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_12
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorization_4
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver_vectorized_shufflejoin
org.apache.hadoop.hive.ql.exec.spark.TestHiveKVResultCache.testResultList
org.apache.hive.hcatalog.streaming.TestStreaming.testEndpointConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/351/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-SPARK-Build/351/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-SPARK-Build-351/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 10 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12681387 - PreCommit-HIVE-SPARK-Build

> ColumnStatsTask fails because of SparkMapJoinResolver
> -----------------------------------------------------
>
>                 Key: HIVE-8859
>                 URL: https://issues.apache.org/jira/browse/HIVE-8859
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>    Affects Versions: spark-branch
>            Reporter: Chao
>            Assignee: Chao
>         Attachments: HIVE-8859.1-spark.patch
>
>
> The following query fails:
> {code}
> ANALYZE TABLE src COMPUTE STATISTICS FOR COLUMNS key,value;
> {code}
> The plan looks like:
> {noformat}
> STAGE DEPENDENCIES:
>   Stage-0 is a root stage
>   Stage-2 is a root stage
> STAGE PLANS:
>   Stage: Stage-0
>     Spark
>       Edges:
>         Reducer 2 <- Map 1 (GROUP, 1)
>       DagName: chao_20141113105959_486b4bba-a2da-43c5-bf42-0ee69cd42576:1
>       Vertices:
>         Map 1 
>             Map Operator Tree:
>                 TableScan
>                   alias: src
>                   Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
>                   Select Operator
>                     expressions: key (type: string), value (type: string)
>                     outputColumnNames: key, value
>                     Statistics: Num rows: 500 Data size: 5312 Basic stats: COMPLETE Column stats: NONE
>                     Group By Operator
>                       aggregations: compute_stats(key, 16), compute_stats(value, 16)
>                       mode: hash
>                       outputColumnNames: _col0, _col1
>                       Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
>                       Reduce Output Operator
>                         sort order: 
>                         Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
>                         value expressions: _col0 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:string,numbitvectors:int>), _col1 (type: struct<columntype:string,maxlength:bigint,sumlength:bigint,count:bigint,countnulls:bigint,bitvector:string,numbitvectors:int>)
>         Reducer 2 
>             Reduce Operator Tree:
>               Group By Operator
>                 aggregations: compute_stats(VALUE._col0), compute_stats(VALUE._col1)
>                 mode: mergepartial
>                 outputColumnNames: _col0, _col1
>                 Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
>                 Select Operator
>                   expressions: _col0 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint>), _col1 (type: struct<columntype:string,maxlength:bigint,avglength:double,countnulls:bigint,numdistinctvalues:bigint>)
>                   outputColumnNames: _col0, _col1
>                   Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
>                   File Output Operator
>                     compressed: false
>                     Statistics: Num rows: 1 Data size: 0 Basic stats: PARTIAL Column stats: NONE
>                     table:
>                         input format: org.apache.hadoop.mapred.TextInputFormat
>                         output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>                         serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>   Stage: Stage-2
>     Column Stats Work
>       Column Stats Desc:
>           Columns: key, value
>           Column Types: string, string
>           Table: src
> {noformat}
> This query will fail because {{SparkMapJoinResolver#createSparkTask}} swaps the order of two tasks in the root task list. But, this is rather interesting, since if they are both root tasks, then order shouldn't matter.
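
For illustration only, here is a minimal sketch of how replacing an element of an ordered list can silently swap its position relative to its siblings. It is not taken from SparkMapJoinResolver, and it does not claim to show what createSparkTask actually does. The class name, method names, and the "Stage-0-rewritten" label are all hypothetical; plain strings stand in for Hive tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a root task list; strings play the role of tasks.
public class RootTaskOrder {

    // Replaces oldTask by removing it and appending newTask at the end.
    // The replacement lands after every other root task, so relative order changes.
    static List<String> replaceByRemoveAndAdd(List<String> roots, String oldTask, String newTask) {
        List<String> result = new ArrayList<>(roots);
        result.remove(oldTask);
        result.add(newTask);
        return result;
    }

    // Replaces oldTask at its original index, so relative order is preserved.
    static List<String> replaceInPlace(List<String> roots, String oldTask, String newTask) {
        List<String> result = new ArrayList<>(roots);
        int idx = result.indexOf(oldTask);
        if (idx >= 0) {
            result.set(idx, newTask);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> roots = Arrays.asList("Stage-0", "Stage-2");
        // prints [Stage-2, Stage-0-rewritten]
        System.out.println(replaceByRemoveAndAdd(roots, "Stage-0", "Stage-0-rewritten"));
        // prints [Stage-0-rewritten, Stage-2]
        System.out.println(replaceInPlace(roots, "Stage-0", "Stage-0-rewritten"));
    }
}
```

If any downstream code assumes a particular position for a root task (an assumption here, not something the report confirms), the first variant would break it even though both tasks are still roots.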



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
