hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Vineet Garg (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21411) LEFT JOIN CONVERT TO INNER JOIN LEAD TO WRONG RESULT
Date Fri, 08 Mar 2019 18:46:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Vineet Garg updated HIVE-21411:
-------------------------------
    Description: 
when i have not assign alias table name to the left side table , left join convert to inner
join. left side table's alias in ast tree called left.

 for example

select nvl(ss_wholesale_cost, 10), d_quarter_name from lulu.store_sales left join lulu.date_dim
on ss_sold_date_sk = d_date_sk limit 10;

{noformat}
| STAGE DEPENDENCIES: |
| Stage-1 is a root stage |
| Stage-0 depends on stages: Stage-1 |
| |
| STAGE PLANS: |
| Stage: Stage-1 |
| Map Reduce |
| Map Operator Tree: |
| TableScan |
| alias: left |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| Filter Operator |
| predicate: ss_sold_date_sk is not null (type: boolean) |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: ss_wholesale_cost (type: decimal(7,2)), ss_sold_date_sk (type: bigint) |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| Reduce Output Operator |
| key expressions: _col1 (type: bigint) |
| sort order: + |
| Map-reduce partition columns: _col1 (type: bigint) |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| value expressions: _col0 (type: decimal(7,2)) |
| TableScan |
| alias: date_dim |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| Filter Operator |
| predicate: d_date_sk is not null (type: boolean) |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: d_date_sk (type: bigint), d_quarter_name (type: string) |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| Reduce Output Operator |
| key expressions: _col0 (type: bigint) |
| sort order: + |
| Map-reduce partition columns: _col0 (type: bigint) |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| value expressions: _col1 (type: string) |
| Reduce Operator Tree: |
| Join Operator |
| condition map: |
| Inner Join 0 to 1 |
| keys: |
| 0 _col1 (type: bigint) |
| 1 _col0 (type: bigint) |
| outputColumnNames: _col0, _col3 |
| Statistics: Num rows: 220 Data size: 31348 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: NVL(_col0,10) (type: decimal(12,2)), _col3 (type: string) |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 220 Data size: 31348 Basic stats: COMPLETE Column stats: NONE |
| Limit |
| Number of rows: 100 |
| Statistics: Num rows: 100 Data size: 14200 Basic stats: COMPLETE Column stats: NONE |
| File Output Operator |
| compressed: false |
| Statistics: Num rows: 100 Data size: 14200 Basic stats: COMPLETE Column stats: NONE |
| table: |
| input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
| output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
| serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
| |
| Stage: Stage-0 |
| Fetch Operator |
| limit: 100 |
| Processor Tree: |
| ListSink |
| |
+--------------------------------------------------------------------------------------------------------+--+

{noformat}
 

  was:
when i have not assign alias table name to the left side table , left join convert to inner
join. left side table's alias in ast tree called left.

 for example

select nvl(ss_wholesale_cost, 10), d_quarter_name from lulu.store_sales left join lulu.date_dim
on ss_sold_date_sk = d_date_sk limit 10;

| STAGE DEPENDENCIES: |
| Stage-1 is a root stage |
| Stage-0 depends on stages: Stage-1 |
| |
| STAGE PLANS: |
| Stage: Stage-1 |
| Map Reduce |
| Map Operator Tree: |
| TableScan |
| alias: left |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| Filter Operator |
| predicate: ss_sold_date_sk is not null (type: boolean) |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: ss_wholesale_cost (type: decimal(7,2)), ss_sold_date_sk (type: bigint) |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| Reduce Output Operator |
| key expressions: _col1 (type: bigint) |
| sort order: + |
| Map-reduce partition columns: _col1 (type: bigint) |
| Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE |
| value expressions: _col0 (type: decimal(7,2)) |
| TableScan |
| alias: date_dim |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| Filter Operator |
| predicate: d_date_sk is not null (type: boolean) |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: d_date_sk (type: bigint), d_quarter_name (type: string) |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| Reduce Output Operator |
| key expressions: _col0 (type: bigint) |
| sort order: + |
| Map-reduce partition columns: _col0 (type: bigint) |
| Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE |
| value expressions: _col1 (type: string) |
| Reduce Operator Tree: |
| Join Operator |
| condition map: |
| Inner Join 0 to 1 |
| keys: |
| 0 _col1 (type: bigint) |
| 1 _col0 (type: bigint) |
| outputColumnNames: _col0, _col3 |
| Statistics: Num rows: 220 Data size: 31348 Basic stats: COMPLETE Column stats: NONE |
| Select Operator |
| expressions: NVL(_col0,10) (type: decimal(12,2)), _col3 (type: string) |
| outputColumnNames: _col0, _col1 |
| Statistics: Num rows: 220 Data size: 31348 Basic stats: COMPLETE Column stats: NONE |
| Limit |
| Number of rows: 100 |
| Statistics: Num rows: 100 Data size: 14200 Basic stats: COMPLETE Column stats: NONE |
| File Output Operator |
| compressed: false |
| Statistics: Num rows: 100 Data size: 14200 Basic stats: COMPLETE Column stats: NONE |
| table: |
| input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
| output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
| serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
| |
| Stage: Stage-0 |
| Fetch Operator |
| limit: 100 |
| Processor Tree: |
| ListSink |
| |
+--------------------------------------------------------------------------------------------------------+--+

 


> LEFT JOIN CONVERT TO INNER JOIN LEAD TO WRONG RESULT
> ----------------------------------------------------
>
>                 Key: HIVE-21411
>                 URL: https://issues.apache.org/jira/browse/HIVE-21411
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 2.1.1, 2.2.0, 2.3.0
>            Reporter: xialu
>            Assignee: Ashutosh Chauhan
>            Priority: Critical
>
> when i have not assign alias table name to the left side table , left join convert to
inner join. left side table's alias in ast tree called left.
>  for example
> select nvl(ss_wholesale_cost, 10), d_quarter_name from lulu.store_sales left join lulu.date_dim
on ss_sold_date_sk = d_date_sk limit 10;
> {noformat}
> | STAGE DEPENDENCIES: |
> | Stage-1 is a root stage |
> | Stage-0 depends on stages: Stage-1 |
> | |
> | STAGE PLANS: |
> | Stage: Stage-1 |
> | Map Reduce |
> | Map Operator Tree: |
> | TableScan |
> | alias: left |
> | Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE
|
> | Filter Operator |
> | predicate: ss_sold_date_sk is not null (type: boolean) |
> | Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE
|
> | Select Operator |
> | expressions: ss_wholesale_cost (type: decimal(7,2)), ss_sold_date_sk (type: bigint)
|
> | outputColumnNames: _col0, _col1 |
> | Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE
|
> | Reduce Output Operator |
> | key expressions: _col1 (type: bigint) |
> | sort order: + |
> | Map-reduce partition columns: _col1 (type: bigint) |
> | Statistics: Num rows: 200 Data size: 28499 Basic stats: COMPLETE Column stats: NONE
|
> | value expressions: _col0 (type: decimal(7,2)) |
> | TableScan |
> | alias: date_dim |
> | Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE
|
> | Filter Operator |
> | predicate: d_date_sk is not null (type: boolean) |
> | Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE
|
> | Select Operator |
> | expressions: d_date_sk (type: bigint), d_quarter_name (type: string) |
> | outputColumnNames: _col0, _col1 |
> | Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE
|
> | Reduce Output Operator |
> | key expressions: _col0 (type: bigint) |
> | sort order: + |
> | Map-reduce partition columns: _col0 (type: bigint) |
> | Statistics: Num rows: 200 Data size: 25639 Basic stats: COMPLETE Column stats: NONE
|
> | value expressions: _col1 (type: string) |
> | Reduce Operator Tree: |
> | Join Operator |
> | condition map: |
> | Inner Join 0 to 1 |
> | keys: |
> | 0 _col1 (type: bigint) |
> | 1 _col0 (type: bigint) |
> | outputColumnNames: _col0, _col3 |
> | Statistics: Num rows: 220 Data size: 31348 Basic stats: COMPLETE Column stats: NONE
|
> | Select Operator |
> | expressions: NVL(_col0,10) (type: decimal(12,2)), _col3 (type: string) |
> | outputColumnNames: _col0, _col1 |
> | Statistics: Num rows: 220 Data size: 31348 Basic stats: COMPLETE Column stats: NONE
|
> | Limit |
> | Number of rows: 100 |
> | Statistics: Num rows: 100 Data size: 14200 Basic stats: COMPLETE Column stats: NONE
|
> | File Output Operator |
> | compressed: false |
> | Statistics: Num rows: 100 Data size: 14200 Basic stats: COMPLETE Column stats: NONE
|
> | table: |
> | input format: org.apache.hadoop.mapred.SequenceFileInputFormat |
> | output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat |
> | serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe |
> | |
> | Stage: Stage-0 |
> | Fetch Operator |
> | limit: 100 |
> | Processor Tree: |
> | ListSink |
> | |
> +--------------------------------------------------------------------------------------------------------+--+
> {noformat}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message