phoenix-dev mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-258) Use skip scan when SELECT DISTINCT on leading row key column(s)
Date Tue, 31 May 2016 01:25:12 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15307059#comment-15307059 ]

Lars Hofhansl commented on PHOENIX-258:
---------------------------------------

Thanks for the full description! I'll add that code later today or tomorrow.

As for the RVCs... When I have a 3-part key (say p1, p2, p3) and do SELECT DISTINCT (p1, p2)
FROM t, shouldn't I get more than 1 result back (assuming there are multiple rows for a combination
of p1 and p2)? There seems to be something at odds here. This always returns a single result.
This happens with or without this optimization. I just do not know how to verify correct
behavior when I do not understand that behavior.

bq. For the previousKey logic in DistinctPrefixFilter, you need to add trailing 0xFF bytes
to ensure you're not skipping back too far. 

The code's doing that: nullPad + previousKey (and it still makes only one copy of the key). I
think that's all that's needed, since we're not looking for a startrow, but rather the next
seek backwards. Since scanning _inside_ a row is forward, this should all be cool.
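
To make the seek-hint discussion concrete, here is a minimal standalone sketch of distinct-prefix skipping over sorted row keys. It is not the DistinctPrefixFilter code from the patch: a TreeSet stands in for the sorted rows, and the names DistinctPrefixSketch, nextKey, previousKeyPadded, distinctForward and distinctReverse are made up for illustration. It assumes a fixed-length prefix, every row key at least prefixLen bytes long, and no row key longer than maxKeyLen. Forward, the hint is the prefix incremented by one. In reverse, the hint is the prefix decremented by one and padded with trailing 0xFF bytes, so the seek lands on the last row of the previous prefix instead of jumping past it.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Illustrative only: a TreeSet plays the role of the sorted row keys.
public class DistinctPrefixSketch {

    // Sorted set of row keys, ordered as unsigned bytes (like HBase row keys).
    public static TreeSet<byte[]> newRowSet() {
        return new TreeSet<byte[]>((a, b) -> Arrays.compareUnsigned(a, b));
    }

    // Forward hint: the prefix incremented as an unsigned big-endian number.
    // Returns null if the prefix is all 0xFF (nothing can sort after it).
    public static byte[] nextKey(byte[] prefix) {
        byte[] out = prefix.clone();
        for (int i = out.length - 1; i >= 0; i--) {
            if (out[i] != (byte) 0xFF) {
                out[i]++;
                return out;
            }
            out[i] = 0; // carry into the next byte to the left
        }
        return null;
    }

    // Reverse hint: the prefix decremented by one, padded with 0xFF up to
    // totalLen bytes. Returns null if the prefix is all zeros.
    public static byte[] previousKeyPadded(byte[] prefix, int totalLen) {
        byte[] out = new byte[totalLen];
        Arrays.fill(out, (byte) 0xFF);
        System.arraycopy(prefix, 0, out, 0, prefix.length);
        for (int i = prefix.length - 1; i >= 0; i--) {
            if (out[i] != 0) {
                out[i]--;
                return out;
            }
            out[i] = (byte) 0xFF; // borrow from the next byte to the left
        }
        return null;
    }

    // One seek per distinct prefix, ascending.
    public static List<byte[]> distinctForward(TreeSet<byte[]> rows, int prefixLen) {
        List<byte[]> result = new ArrayList<>();
        byte[] row = rows.isEmpty() ? null : rows.first();
        while (row != null) {
            byte[] prefix = Arrays.copyOf(row, prefixLen);
            result.add(prefix);
            byte[] hint = nextKey(prefix);
            row = (hint == null) ? null : rows.ceiling(hint); // first row >= hint
        }
        return result;
    }

    // One seek per distinct prefix, descending.
    public static List<byte[]> distinctReverse(TreeSet<byte[]> rows, int prefixLen, int maxKeyLen) {
        List<byte[]> result = new ArrayList<>();
        byte[] row = rows.isEmpty() ? null : rows.last();
        while (row != null) {
            byte[] prefix = Arrays.copyOf(row, prefixLen);
            result.add(prefix);
            byte[] hint = previousKeyPadded(prefix, maxKeyLen);
            row = (hint == null) ? null : rows.floor(hint); // last row <= hint
        }
        return result;
    }

    public static void main(String[] args) {
        TreeSet<byte[]> rows = newRowSet();
        for (String s : new String[] {"a1x", "a2y", "b1x", "c1x", "c2y"}) {
            rows.add(s.getBytes(StandardCharsets.US_ASCII));
        }
        for (byte[] p : distinctForward(rows, 1)) {
            System.out.println("forward: " + new String(p, StandardCharsets.US_ASCII));
        }
        for (byte[] p : distinctReverse(rows, 1, 3)) {
            System.out.println("reverse: " + new String(p, StandardCharsets.US_ASCII));
        }
    }
}
```

Without the 0xFF padding, the reverse hint for prefix "b" would be just "a", and a seek to the last row at or before "a" would skip every row that starts with "a" but is longer than one byte.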


> Use skip scan when SELECT DISTINCT on leading row key column(s)
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-258
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-258
>             Project: Phoenix
>          Issue Type: Task
>            Reporter: ryang-sfdc
>            Assignee: Lars Hofhansl
>             Fix For: 4.8.0
>
>         Attachments: 258-WIP.txt, 258-v1.txt, 258-v10.txt, 258-v2.txt, 258-v3.txt, 258-v4.txt, 258-v5.txt, 258-v6.txt, 258-v7.txt, 258-v8.txt, 258-v9.txt, 258.txt, DistinctFixedPrefixFilter.java, in-clause.png
>
>
> create table(a varchar(32) not null, date date not null constraint pk primary key(a,date))
> [["PLAN"],["CLIENT PARALLEL 94-WAY FULL SCAN OVER foo"],["    SERVER AGGREGATE INTO ORDERED DISTINCT ROWS BY [a]"],["CLIENT MERGE SORT"]]
> We should skip scan.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
