phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ethan Wang (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PHOENIX-153) Implement TABLESAMPLE clause
Date Sun, 25 Jun 2017 20:45:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044823#comment-16044823
] 

Ethan Wang edited comment on PHOENIX-153 at 6/25/17 8:44 PM:
-------------------------------------------------------------

Spec of this patch. Feedback plz.

Updated Jun25. Syntax to be similar to Calcite and PostgresSQL

++
++Belows are SUPPORTED
++
===BASE CASE====
select * from Person;
select * from PERSON TABLESAMPLE(45);
select * from PERSON TABLESAMPLE (45);

===WHERE CLAUSE====
select * from PERSON where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE (49) where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE (49) where ADDRESS = 'CA' OR name>'tina3' ORDER BY ADDRESS
LIMIT 2;

===Wired Table===
select * from LOCAL_ADDRESS TABLESAMPLE (79);
select * from SYSTEM.STATS TABLESAMPLE (41);

===CORNER CASE===
select * from PERSON TABLESAMPLE (0);
select * from PERSON TABLESAMPLE (100.0);
select * from PERSON TABLESAMPLE (145.99);  (out of boundary exception)
select * from PERSON TABLESAMPLE ko; (syntax error exception)

===AGGREGATION===
select count(*) from PERSON TABLESAMPLE (49) LIMIT 2
select count(*) from (select NAME from PERSON TABLESAMPLE (49) limit 20)

===SUB QUERY===
select * from (select /*+NO_INDEX*/ * from PERSON tablesample (10) where Name > 'tina10')
where ADDRESS = 'CA'

===JOINS===
select * from PERSON1, PERSON2 tablesample (70) where PERSON1.Name = PERSON2.NAME
select /*+NO_INDEX*/ count(*) from PERSON tableSample (19), US_POPULATION tableSample (28)
where PERSON.Name > US_POPULATION.STATE

===QUERY being OPTMIZED===
select * from PERSON tablesample (80) (may go to IDX_ADDRESS_PERSON index table, but table
sampling rate carry on)

===INSERT SELECT====
upsert into personbig(ID, ADDRESS) select id, address from personbig tablesample (1);


was (Author: aertoria):
Spec of this patch. Feedback plz.
++
++Belows are SUPPORTED
++
===BASE CASE====
select * from Person;
select * from PERSON TABLESAMPLE 0.45;

===WHERE CLAUSE====
select * from PERSON where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE 0.49 where ADDRESS = 'CA' OR name>'tina3';
select * from PERSON TABLESAMPLE 0.49 where ADDRESS = 'CA' OR name>'tina3' LIMIT 1;

===Wired Table===
select * from LOCAL_ADDRESS TABLESAMPLE 0.79;
select * from SYSTEM.STATS TABLESAMPLE 0.41;

===CORNER CASE===
select * from PERSON TABLESAMPLE 0;
select * from PERSON TABLESAMPLE 1.45;
select * from PERSON TABLESAMPLE kko;

===AGGREGATION===
select count( * ) from PERSON TABLESAMPLE 0.5 LIMIT 2
select count( * ) from (select NAME from PERSON limit 20)

===SUB QUERY===
select * from (select /*+NO_INDEX*/ * from PERSON tablesample 0.1 where Name > 'tina10')
where ADDRESS = 'CA'

===JOINS===
select * from PERSON1, PERSON2 tablesample 0.7 where PERSON1.Name = PERSON2.NAME

===QUERY being OPTMIZED===
select * from PERSON tablesample 0.8 (goes to IDX_ADDRESS_PERSON index table, table sample
carry on)

===INSERT SELECT====
upsert into personbig(ID, ADDRESS) select id, address from personbig tablesample 0.01;

> Implement TABLESAMPLE clause
> ----------------------------
>
>                 Key: PHOENIX-153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-153
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: Ethan Wang
>              Labels: enhancement
>
> Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip
next hint based on the region boundaries of the table to only return n rows per region.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message