phoenix-dev mailing list archives

From "Ethan Wang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PHOENIX-153) Implement TABLESAMPLE clause
Date Sat, 09 Sep 2017 19:36:02 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Wang updated PHOENIX-153:
-------------------------------
    Description: 
Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip next
hint based on the region boundaries of the table to only return n rows per region.

When the TABLESAMPLE clause is used, Phoenix will sample (N) percent of the HBase table with
only O(M) run-time complexity. (N is the size of the table, M is the size of the stats.)
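
To illustrate why the cost can scale with the size of the stats rather than the size of the table, here is a minimal Python sketch. It is not Phoenix's implementation: the function name `sample_chunks`, the `boundaries` list (standing in for the stats), and the hash-based selection rule are all assumptions made for illustration. The idea shown is that the sampling decision is made once per chunk between consecutive boundaries, so choosing what to scan takes one pass over M boundaries and reads no table rows.

```python
import hashlib


def sample_chunks(boundaries, percent):
    """Pick the key ranges to scan for a percent-based table sample.

    boundaries: sorted list of boundary keys (bytes); each consecutive
                pair delimits one chunk of the table (hypothetical stats).
    percent:    sampling rate, from 0 to 100.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")

    selected = []
    # One pass over the M boundaries; no table rows are touched here.
    for start, end in zip(boundaries, boundaries[1:]):
        # Hash the chunk's start key to a stable bucket in [0, 100).
        digest = hashlib.sha256(start).digest()
        bucket = int.from_bytes(digest[:8], "big") % 100
        # Keep the chunk when its bucket falls below the sampling rate.
        if bucket < percent:
            selected.append((start, end))
    return selected
```

Because each chunk's bucket is derived from its start key, the selection is repeatable across runs, and a sample taken at a lower rate is always a subset of a sample taken at a higher rate. The selected fraction only approximates the requested percentage, and the approximation improves as the number of chunks grows.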

[Update]
Syntax of using table sampling:
select * from PERSON TABLESAMPLE(45);
select count( * ) from PERSON TABLESAMPLE (49) LIMIT 2

Source Code: 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25

  was:
Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip next
hint based on the region boundaries of the table to only return n rows per region.

When the TABLESAMPLE clause is used, Phoenix will sample (n) percent of the HBase table with
only O(M) run-time complexity. (N is the size of the table, M is the size of the stats.)

[Update]
Syntax of using table sampling:
select * from PERSON TABLESAMPLE(45);
select count( * ) from PERSON TABLESAMPLE (49) LIMIT 2

Source Code: 
https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25


> Implement TABLESAMPLE clause
> ----------------------------
>
>                 Key: PHOENIX-153
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-153
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: James Taylor
>            Assignee: Ethan Wang
>              Labels: enhancement
>             Fix For: 4.12.0
>
>         Attachments: Sampling_Accuracy_Performance.jpg
>
>
> Support the standard SQL TABLESAMPLE clause by implementing a filter that uses a skip next
> hint based on the region boundaries of the table to only return n rows per region.
> When the TABLESAMPLE clause is used, Phoenix will sample (N) percent of the HBase table with
> only O(M) run-time complexity. (N is the size of the table, M is the size of the stats.)
> [Update]
> Syntax of using table sampling:
> select * from PERSON TABLESAMPLE(45);
> select count( * ) from PERSON TABLESAMPLE (49) LIMIT 2
> Source Code: 
> https://git-wip-us.apache.org/repos/asf?p=phoenix.git;a=commitdiff;h=5e33dc12bc088bd0008d89f0a5cd7d5c368efa25



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
