phoenix-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Daniel Wong (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4845) Support using Row Value Constructors in OFFSET clause to support paging in tables where the sort order of PK columns varies
Date Fri, 14 Dec 2018 18:58:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16721701#comment-16721701
] 

Daniel Wong commented on PHOENIX-4845:
--------------------------------------

h2. SQL Standard and Phoenix-3383

The SQL99 standard states:
{quote}7) Let Rx and Ry be the two <row value constructor>s of the <com-
parison predicate> and let RXi and RYi be the i-th <row value
constructor element>s of Rx and Ry, respectively. "Rx <comp op>
Ry" is true, false, or unknown as follows:
...
"Rx > Ry" is true if and only if RXi = RYi for all i < n and
 RXn > RYn for some n.
{quote}
Phoenix-3383 sought to properly provide SQL standard behavior as it was found that the current
behavior, which supported pagination did not match SQL standard.
Consider tables with pk columns with differing sort orders.
||Row #||A Desc||B||
|1|e|1|
|2|d|2|
|3|c|1|
|4|c|2|
|5|c|3|
|6|b|1|
|7|a|2|

By SQL standard if the select was 

 
{code:java}
SELECT * FROM TABLE ORDER BY A DESC, B WHERE (A, B) > ('c', 2)
 
{code}
We would expect rows 1,2, and 5 to return today we return rows 5,6,7. Note that the rows are
not contiguous and therefore RVC is not suited by SQL standard to pagination. In order to
merge Phoenix-3383 and support SQL standard we must change how we perform pagination.


h1. Recommended Approaches
h2. PHOENIX-4845 Support using Row Value Constructors in OFFSET clause to support paging in
tables where the sort order of PK columns varies.

Provide syntax like:

 
{code:java}
SELECT * FROM TABLE ORDER BY A, B OFFSET (?,?) LIMIT 2000
 
{code}
The scope of this clause is for single table paginated selects with the row key similar to
the base table or index. Exceptions will be thrown if the following happen: Order By does
not match either an index or the base table, more than one table is specified in the from
clause, the RVC in the Offset clause does not match the leading edge of the row keys. 
 The new syntax will trigger the Phoenix client to push the RVC key to the start row of the
HBase scan.
h2. Extend PHOENIX-2606 to support server side cursors.

Extend Phoenix Cursor syntax like:

 
{code:java}
DECLARE MY_TABLE_CURSOR CURSOR SERVER FOR SELECT * FROM TABLE
 
{code}
we would add a row to a new System Table, SYSTEM.CURSORS similar to the following, with PK(CURSOR_NAME,CURSOR_INSTANCE_ID,TABLE)
||CURSOR_NAME||CURSOR_INSTANCE_ID||TABLE||QUERY||LAST_KEY||
|MY_TABLE_CURSOR|123|TABLE|SELECT * FROM TABLE|null|
| | | | | |

On the following fetches similar to :

 
{code:java}
FETCH NEXT 2000 ROWS FROM MY_TABLE_CURSOR
 
{code}
We would fetch from the system table, and construct a a query similar to the stored query
and use the LAST_KEY as the start row of the scan.
h2. Advantages and Disadvantages
h3. OFFSET advantages

Requires less code changes in Phoenix
 Both cursor and offset require being able to push a row key into a scan
h3. OFFSET DISADVANTAGES

Offset is not SQL standard
h3. Server Side Cursor advantages

SQL standard behavior and intended method for pagination
 As service owners being able to provide pagination easily to the end users will provide benefits
 Since the users' intent is clear for pagination, performance and behavior improvements can
be delivered without affecting users.
h3. Disadvantages

CURSOR is that it's stateful, now we need to keep track of the state and query about that
state.
 Larger code changes required as well as management of more metadata.
 
h2. Proposal

Implement OFFSET first, in the future for Phoenix's general usability leveraged it as a piece
of future cursor implementations. 
h1.  

 

> Support using Row Value Constructors in OFFSET clause to support paging in tables where
the sort order of PK columns varies
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-4845
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4845
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Thomas D'Silva
>            Priority: Major
>              Labels: DESC, SFDC
>         Attachments: PHOENIX-offset.txt
>
>
> RVCs along with the LIMIT clause are useful for efficiently paging through rows (see
[http://phoenix.apache.org/paged.html]). This works well if the pk columns are sorted ascending,
we can always use the > operator to query for the next batch of row. 
> However if the PK of a table is (A  DESC, B DESC) we cannot use the following query
to page through the data
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) > (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> Since the rows are sorted by A desc and then by B descending we need change the comparison
order
> {code:java}
> SELECT * FROM TABLE WHERE (A, B) < (?, ?) ORDER BY A DESC, B DESC LIMIT 20
> {code}
> If the PK of a table contains columns with mixed sort order for eg (A  DESC, B) then
we cannot use RVC to page through data. 
> If we supported using RVCs in the offset clause we could use the offset to set the start
row of the scan. Clients would not have to have logic to determine the comparison operator.
This would also support paging through data for tables where the PK columns are sorted in
mixed order. 
> {code:java}
> SELECT * FROM TABLE ORDER BY A DESC, B LIMIT 20 OFFSET (?,?)
> {code}
> We would only allow using the offset if the rows are ordered by the sort order of the
PK columns.
>  
> FYI [~jfernando_sfdc]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message