hive-dev mailing list archives

From "Yin Huai (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly
Date Wed, 31 Jul 2013 20:23:48 GMT

     [ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-4968:
---------------------------

    Summary: When deduplicating multiple SelectOperators, we should update RowResolver accordingly  (was: Broken plan in MapJoin)

> When deduplicating multiple SelectOperators, we should update RowResolver accordingly
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-4968
>                 URL: https://issues.apache.org/jira/browse/HIVE-4968
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT key, value
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT *
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp4.key, tmp4.value, tmp4.count
> FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
>       FROM (SELECT *
>             FROM (SELECT key, value
>                   FROM src) tmp1 ) tmp2
>       JOIN (SELECT count(*) as count
>             FROM src) tmp3
>       ) tmp4;
> {code}
> The plan is not executable.
> The part of the plan related to the MapJoin is:
> {code}
>   Stage: Stage-5
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         tmp4:tmp2:tmp1:src 
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         tmp4:tmp2:tmp1:src 
>           TableScan
>             alias: src
>             Select Operator
>               expressions:
>                     expr: key
>                     type: string
>                     expr: value
>                     type: string
>               outputColumnNames: _col0, _col1
>               HashTable Sink Operator
>                 condition expressions:
>                   0 
>                   1 {_col0}
>                 handleSkewJoin: false
>                 keys:
>                   0 []
>                   1 []
>                 Position of Big Table: 1
>   Stage: Stage-4
>     Map Reduce
>       Alias -> Map Operator Tree:
>         $INTNAME 
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 
>                 1 {_col0}
>               handleSkewJoin: false
>               keys:
>                 0 []
>                 1 []
>               outputColumnNames: _col2
>               Position of Big Table: 1
>               Select Operator
>                 expressions:
>                       expr: _col0
>                       type: string
>                       expr: _col1
>                       type: string
>                       expr: _col2
>                       type: bigint
>                 outputColumnNames: _col0, _col1, _col2
>                 File Output Operator
>                   compressed: false
>                   GlobalTableId: 0
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>       Local Work:
>         Map Reduce Local Work
> {code}
> The outputColumnNames of the MapJoin is '_col2', but it should be '_col0, _col1, _col2'.
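
The mismatch reported above can be checked by hand against the Stage-4 plan: the Select Operator under the Map Join Operator reads _col0, _col1 and _col2, while the Map Join Operator lists only _col2 in its outputColumnNames. The following is a minimal sketch of that check, not Hive code. The function name `missing_columns` is hypothetical, and the column lists are copied from the plan rather than read from a live query.

```python
def missing_columns(parent_outputs, child_inputs):
    """Return the child's input columns that the parent does not produce, in order."""
    produced = set(parent_outputs)
    return [col for col in child_inputs if col not in produced]


# Map Join Operator in Stage-4 -> outputColumnNames: _col2
mapjoin_outputs = ["_col2"]
# Select Operator directly under the Map Join Operator -> expr: _col0, _col1, _col2
select_inputs = ["_col0", "_col1", "_col2"]

missing = missing_columns(mapjoin_outputs, select_inputs)
print(missing)  # ['_col0', '_col1']
```

With the expected outputColumnNames of '_col0, _col1, _col2', the same check returns an empty list, which matches the description of what the plan should contain.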

