hive-dev mailing list archives

From "Hive QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-5945) ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those tables which are not used in the child of this conditional task.
Date Mon, 30 Dec 2013 05:25:51 GMT

    [ https://issues.apache.org/jira/browse/HIVE-5945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858568#comment-13858568
] 

Hive QA commented on HIVE-5945:
-------------------------------



{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12620793/HIVE-5945.5.patch.txt

{color:green}SUCCESS:{color} +1 4818 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/767/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/767/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12620793

> ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask also sums those tables which
are not used in the child of this conditional task.
> -----------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-5945
>                 URL: https://issues.apache.org/jira/browse/HIVE-5945
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
>            Reporter: Yin Huai
>            Assignee: Navis
>            Priority: Critical
>         Attachments: HIVE-5945.1.patch.txt, HIVE-5945.2.patch.txt, HIVE-5945.3.patch.txt,
HIVE-5945.4.patch.txt, HIVE-5945.5.patch.txt
>
>
> Here is an example
> {code}
> select
>    i_item_id,
>    s_state,
>    avg(ss_quantity) agg1,
>    avg(ss_list_price) agg2,
>    avg(ss_coupon_amt) agg3,
>    avg(ss_sales_price) agg4
> FROM store_sales
> JOIN date_dim on (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
> JOIN item on (store_sales.ss_item_sk = item.i_item_sk)
> JOIN customer_demographics on (store_sales.ss_cdemo_sk = customer_demographics.cd_demo_sk)
> JOIN store on (store_sales.ss_store_sk = store.s_store_sk)
> where
>    cd_gender = 'F' and
>    cd_marital_status = 'U' and
>    cd_education_status = 'Primary' and
>    d_year = 2002 and
>    s_state in ('GA','PA', 'LA', 'SC', 'MI', 'AL')
> group by
>    i_item_id,
>    s_state
> order by
>    i_item_id,
>    s_state
> limit 100;
> {code}
> I turned off noconditionaltask, so I expected 4 Map-only jobs for this query. However,
I got 1 Map-only job (joining store_sales and date_dim) and 3 MR jobs (for the reduce
joins).
> So, I checked the conditional task that determines the plan of the join involving item. In
ql.plan.ConditionalResolverCommonJoin.resolveMapJoinTask, aliasToFileSizeMap contains all
input tables used in this query as well as the intermediate table generated by joining
store_sales and date_dim. So, when we sum the sizes of all small tables, the size of
store_sales (which is around 45GB in my test) is also counted.
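
The effect described above can be illustrated with a minimal Python sketch. It is not Hive's Java code and not the attached patch: the function names, the alias `intermediate_ss_dd`, the 25,000,000-byte threshold, and every size except the roughly 45GB reported for store_sales are assumptions made up for illustration. It compares a sum over every alias in the map with a sum restricted to the aliases that actually take part in the join with item.

```python
GB = 1024 ** 3
MB = 1024 ** 2


def sum_all_other_aliases(alias_to_size, big_alias):
    """Sum every alias in the map except the big table (the behaviour reported in the issue)."""
    return sum(size for alias, size in alias_to_size.items() if alias != big_alias)


def sum_participating_small_aliases(alias_to_size, participants, big_alias):
    """Sum only the aliases that take part in this join, excluding the big table."""
    return sum(
        size
        for alias, size in alias_to_size.items()
        if alias in participants and alias != big_alias
    )


# Hypothetical sizes; only the ~45GB for store_sales comes from the report.
alias_to_size = {
    "store_sales": 45 * GB,
    "date_dim": 10 * MB,
    "item": 5 * MB,
    "customer_demographics": 80 * MB,
    "store": 1 * MB,
    "intermediate_ss_dd": 8 * GB,  # assumed name for the store_sales/date_dim join output
}

# The join under consideration: intermediate result joined with item.
participants = {"intermediate_ss_dd", "item"}
big_alias = "intermediate_ss_dd"
threshold = 25_000_000  # assumed small-table size limit in bytes

counted_all = sum_all_other_aliases(alias_to_size, big_alias)
counted_join = sum_participating_small_aliases(alias_to_size, participants, big_alias)

fits_all = counted_all <= threshold
fits_join = counted_join <= threshold

print("all aliases counted:", counted_all, "fits:", fits_all)
print("join aliases counted:", counted_join, "fits:", fits_join)
```

With these made-up numbers, the first sum includes store_sales and exceeds the threshold, so a map join would be rejected, while the second sum counts only item and stays under it.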



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
