hive-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (Jira)" <j...@apache.org>
Subject [jira] [Work logged] (HIVE-23817) Pushing TopN Key operator PKFK inner joins
Date Tue, 21 Jul 2020 11:17:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-23817?focusedWorklogId=461528&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-461528
]

ASF GitHub Bot logged work on HIVE-23817:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Jul/20 11:16
            Start Date: 21/Jul/20 11:16
    Worklog Time Spent: 10m 
      Work Description: kasakrisz commented on a change in pull request #1228:
URL: https://github.com/apache/hive/pull/1228#discussion_r458020139



##########
File path: ql/src/test/queries/clientpositive/topnkey_inner_join.q
##########
@@ -0,0 +1,49 @@
+drop table if exists customer;
+drop table if exists orders;
+
+create table customer (id int, name string, email string);
+create table orders (customer_id int not null enforced, amount int);
+
+alter table customer add constraint pk_customer_id primary key (id) disable novalidate rely;
+alter table orders add constraint fk_order_customer_id foreign key (customer_id) references
customer(id) disable novalidate rely;
+
+insert into customer values
+  (4, 'Heisenberg', 'heisenberg@email.com'),
+  (3, 'Smith', 'smith@email.com'),
+  (2, 'Jones', 'jones@email.com'),
+  (1, 'Robinson', 'robinson@email.com');
+
+insert into orders values
+  (2, 200),
+  (3, 40),
+  (1, 100),
+  (1, 50),
+  (3, 30);
+
+set hive.optimize.topnkey=true;
+set hive.optimize.limittranspose=false;
+
+select 'positive: order by columns are coming from child table';
+explain select * from orders join customer on customer.id = orders.customer_id order by orders.amount
limit 3;
+explain select * from orders join customer on customer.id = orders.customer_id order by orders.customer_id,
orders.amount limit 3;
+explain select * from customer join orders on orders.customer_id = customer.id order by orders.amount,
orders.customer_id limit 3;
+
+select 'negative: order by columns are coming from referenced table';
+explain select * from orders join customer on customer.id = orders.customer_id order by customer.name
limit 3;
+explain select * from orders join customer on customer.id = orders.customer_id order by customer.email,
customer.name limit 3;
+
+select 'negative: 1st order by columns is coming from referenced table';
+explain select * from orders join customer on customer.id = orders.customer_id order by customer.name,
orders.amount limit 3;
+
+select 'mixed/positive: 1st n order by columns are coming from child table';
+explain select * from orders join customer on customer.id = orders.customer_id order by orders.amount,
customer.name limit 3;
+
+select 'negative: no PK/FK';
+alter table customer drop constraint pk_customer_id;
+alter table orders drop constraint fk_order_customer_id;
+explain select * from customer join orders on customer.id = orders.customer_id order by customer.id
limit 3;
+
+select 'negatie: no RELY';
+alter table customer add constraint pk_customer_id primary key (id) disable novalidate;
+alter table orders add constraint fk_order_customer_id foreign key (customer_id) references
customer(id) disable novalidate;
+explain select * from customer join orders on customer.id = orders.customer_id order by customer.id
limit 3;

Review comment:
       It is also worth to execute these queries to catch correctness issues




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 461528)
    Time Spent: 0.5h  (was: 20m)

> Pushing TopN Key operator PKFK inner joins
> ------------------------------------------
>
>                 Key: HIVE-23817
>                 URL: https://issues.apache.org/jira/browse/HIVE-23817
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Attila Magyar
>            Assignee: Attila Magyar
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> If there is primary key foreign key relationship between the tables we can push the topnkey
operator through the join.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Mime
View raw message