hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-16600) Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in multi_insert cases
Date Fri, 19 May 2017 09:58:04 GMT


Hive QA commented on HIVE-16600:

Here are the results of testing the latest attachment:

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10738 tests executed
*Failed tests:*
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_if_expr] (batchId=144)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_3] (batchId=97)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainuser_3] (batchId=97)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query24] (batchId=231)

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed

This message is automatically generated.

ATTACHMENT ID: 12868904 - PreCommit-HIVE-Build

> Refactor SetSparkReducerParallelism#needSetParallelism to enable parallel order by in
multi_insert cases
> --------------------------------------------------------------------------------------------------------
>                 Key: HIVE-16600
>                 URL:
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: liyunzhang_intel
>            Assignee: liyunzhang_intel
>         Attachments: HIVE-16600.1.patch, HIVE-16600.2.patch, HIVE-16600.3.patch, HIVE-16600.4.patch,
HIVE-16600.5.patch, HIVE-16600.6.patch, HIVE-16600.7.patch, mr.explain, mr.explain.log.HIVE-16600
> {code}
> set hive.exec.reducers.bytes.per.reducer=256;
> set hive.optimize.sampling.orderby=true;
> drop table if exists e1;
> drop table if exists e2;
> create table e1 (key string, value string);
> create table e2 (key string);
> FROM (select key, cast(key as double) as keyD, value from src order by key) a
>     SELECT key, value
>     SELECT key;
> select * from e1;
> select * from e2;
> {code} 
> the parallelism of Sort is 1 even we enable parallel order by("hive.optimize.sampling.orderby"
is set as "true").  This is not reasonable because the parallelism  should be calcuated by
> this is because SetSparkReducerParallelism#needSetParallelism returns false when [children
size of RS|]
is greater than 1.
> in this case, the children size of {{RS[2]}} is two.
> the logical plan of the case
> {code}
>    TS[0]-SEL[1]-RS[2]-SEL[3]-SEL[4]-FS[5]
>                             -SEL[6]-FS[7]
> {code}

This message was sent by Atlassian JIRA

View raw message