hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-6668) When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.
Date Mon, 17 Mar 2014 02:48:43 GMT

    [ https://issues.apache.org/jira/browse/HIVE-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13937420#comment-13937420 ]

Hive QA commented on HIVE-6668:
-------------------------------



{color:red}Overall{color}: -1 at least one test failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12634971/HIVE-6668.2.patch.txt

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 5406 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join25
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_mapjoin_hook
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_auto_sortmerge_join_16
org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_mapreduce_stack_trace_hadoop20
{noformat}

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1855/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/1855/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12634971

> When auto join convert is on and noconditionaltask is off, ConditionalResolverCommonJoin fails to resolve map joins.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6668
>                 URL: https://issues.apache.org/jira/browse/HIVE-6668
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0, 0.14.0
>            Reporter: Yin Huai
>            Assignee: Navis
>            Priority: Blocker
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6668.1.patch.txt, HIVE-6668.2.patch.txt
>
>
> I tried the following query today ...
> {code:sql}
> set mapred.job.map.memory.mb=2048;
> set mapred.job.reduce.memory.mb=2048;
> set mapred.map.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.child.java.opts=-server -Xmx3072m -Djava.net.preferIPv4Stack=true;
> set mapred.reduce.tasks=60;
> set hive.stats.autogather=false;
> set hive.exec.parallel=false;
> set hive.enforce.bucketing=true;
> set hive.enforce.sorting=true;
> set hive.map.aggr=true;
> set hive.optimize.bucketmapjoin=true;
> set hive.optimize.bucketmapjoin.sortedmerge=true;
> set hive.mapred.reduce.tasks.speculative.execution=false;
> set hive.auto.convert.join=true;
> set hive.auto.convert.sortmerge.join=true;
> set hive.auto.convert.sortmerge.join.noconditionaltask=false;
> set hive.auto.convert.join.noconditionaltask=false;
> set hive.auto.convert.join.noconditionaltask.size=100000000;
> set hive.optimize.reducededuplication=true;
> set hive.optimize.reducededuplication.min.reducer=1;
> set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
> set hive.mapjoin.smalltable.filesize=45000000;
> set hive.optimize.index.filter=false;
> set hive.vectorized.execution.enabled=false;
> set hive.optimize.correlation=false;
> select
>    i_item_id,
>    s_state,
>    avg(ss_quantity) agg1,
>    avg(ss_list_price) agg2,
>    avg(ss_coupon_amt) agg3,
>    avg(ss_sales_price) agg4
> FROM store_sales
> JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
> JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
> JOIN customer_demographics on (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk)
> JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
> where
>    cd_gender = 'F' and
>    cd_marital_status = 'U' and
>    cd_education_status = 'Primary' and
>    d_year = 2002 and
>    s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
> group by i_item_id, s_state with rollup
> order by
>    i_item_id,
>    s_state
> limit 100;
> {code}
> The log shows ...
> {code}
> 14/03/14 17:05:02 INFO plan.ConditionalResolverCommonJoin: Failed to resolve driver alias (threshold : 45000000, length mapping : {store=94175, store_sales=48713909726, item=39798667, customer_demographics=1660831, date_dim=2275902})
> Stage-27 is filtered out by condition resolver.
> 14/03/14 17:05:02 INFO exec.Task: Stage-27 is filtered out by condition resolver.
> Stage-28 is filtered out by condition resolver.
> 14/03/14 17:05:02 INFO exec.Task: Stage-28 is filtered out by condition resolver.
> Stage-3 is selected by condition resolver.
> {code}
> Stage-3 is a reduce join. Actually, the resolver should pick the map join.
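
To make the expectation in the description concrete, here is a rough sketch of the size check using the numbers from the log. It is not Hive's implementation: `pick_driver_alias` is a hypothetical helper, and the rule it applies (a map join is possible when the combined size of all tables other than the streamed "driver" table is at or below `hive.mapjoin.smalltable.filesize`) is an assumption about the intended behavior.

```python
# Sizes in bytes, copied from the "length mapping" in the log above.
ALIAS_SIZES = {
    "store": 94175,
    "store_sales": 48713909726,
    "item": 39798667,
    "customer_demographics": 1660831,
    "date_dim": 2275902,
}

# hive.mapjoin.smalltable.filesize from the query settings ("threshold" in the log).
THRESHOLD = 45000000


def pick_driver_alias(alias_sizes, threshold):
    """Hypothetical helper, not Hive code.

    Assumed rule: an alias can be the driver (big, streamed) table if the
    combined size of every other alias is at or below the threshold.
    Among valid candidates, prefer the largest table as the driver.
    Returns None when no alias qualifies.
    """
    total = sum(alias_sizes.values())
    best = None
    for alias, size in alias_sizes.items():
        small_sum = total - size  # combined size of all the other tables
        if small_sum <= threshold and (best is None or size > alias_sizes[best]):
            best = alias
    return best


driver = pick_driver_alias(ALIAS_SIZES, THRESHOLD)
small_sum = sum(ALIAS_SIZES.values()) - ALIAS_SIZES[driver]
print(driver, small_sum)  # prints: store_sales 43829575
```

Under that assumed rule, the four dimension tables total 43829575 bytes, which is below the 45000000 threshold, so `store_sales` would qualify as the driver alias and a map join stage could be selected instead of Stage-3.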



--
This message was sent by Atlassian JIRA
(v6.2#6252)
