cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI
Date Fri, 29 Jul 2016 10:48:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15399120#comment-15399120
] 

Alex Petrov commented on CASSANDRA-11990:
-----------------------------------------

During several discussions it was proposed that we evaluate support for different partitioners,
since it would help with wider SASI adoption and remove the current limitation to Long tokens. I've
evaluated this and can conclude that support for constant-size tokens can be included
in the patch without large overhead. The patch was adjusted accordingly. There are still several
failing tests, but they'll be fixed shortly.

Support for variable-size tokens (for partitioners such as {{ByteOrderedPartitioner}}) requires
a much larger time investment. My personal suggestion is to encode them with their size and avoid
on-disk format changes. This will result in a more complex iteration process for variable-size
tokens, since we'll have to skip tokens depending on their size and won't be able to use simple
multiplication for offset calculation. I've made a small patch / proof of concept for
variable-size tokens by adding a `serializedSize` method to the token tree nodes. Currently (for
the sake of the POC and in order to save some time), this is done by reusing the `serialize`
function, passing a throwaway byte buffer, and calculating offsets by iterating and reading the
integers that hold the token size. It worked just fine for simple cases. I'll mention that the
SASI code is written very well and the offset calculation methods are very well isolated.
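
To illustrate the difference described above, here is a minimal sketch, not the code from the
patch. It assumes a hypothetical layout in which each variable-size token is written as an int
length followed by the token bytes; the class and method names (`TokenOffsets`, `fixedOffset`,
`variableOffset`) are made up for this example and do not exist in SASI.

```java
import java.nio.ByteBuffer;

final class TokenOffsets {
    // Assumption: a fixed-size token is serialized as a single long.
    static final int FIXED_TOKEN_SIZE = Long.BYTES;

    /** Fixed-size tokens: the position of entry `index` is found by multiplication. */
    static int fixedOffset(int start, int index) {
        return start + index * FIXED_TOKEN_SIZE;
    }

    /** Writes each token as [int length][bytes]. Hypothetical layout, not the SASI format. */
    static ByteBuffer writeVariable(byte[][] tokens) {
        int size = 0;
        for (byte[] t : tokens) size += Integer.BYTES + t.length;
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] t : tokens) buf.putInt(t.length).put(t);
        buf.flip();
        return buf;
    }

    /** Variable-size tokens: walk the length prefixes, skipping `index` entries. */
    static int variableOffset(ByteBuffer buf, int start, int index) {
        int pos = start;
        for (int i = 0; i < index; i++) {
            int len = buf.getInt(pos);   // absolute read, buffer position is untouched
            pos += Integer.BYTES + len;  // skip the length prefix and the token itself
        }
        return pos;
    }

    /** Reads the token bytes of entry `index` using the skipping walk above. */
    static byte[] readVariable(ByteBuffer buf, int start, int index) {
        int pos = variableOffset(buf, start, index);
        byte[] out = new byte[buf.getInt(pos)];
        for (int i = 0; i < out.length; i++) out[i] = buf.get(pos + Integer.BYTES + i);
        return out;
    }
}
```

The fixed-size lookup is constant time, while the variable-size lookup is linear in the number
of entries skipped, which is the extra iteration cost mentioned above.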

That said, I'd suggest leaving the "algorithmic" heavy lifting (variable token offset
calculation) for a separate ticket to reduce the scope of the current ticket. Since it's not
going to require on-disk format changes, we can safely postpone this work.


Another thing that's been mentioned is to include the column offset in the clustering offset
long. I'll be evaluating this proposal in terms of performance today. It seems that we can
avoid increasing the size of the {{long[]}} array that holds offsets, and this change can help to
avoid post-filtering altogether. An additional optimisation (which, once again, could be left
for a follow-up patch) is to avoid the second seek within the data file for cases when we
are only querying columns that are indexed. This can be a significant performance improvement,
although it'd be good to discuss whether such queries are widely used.
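
As a rough illustration of how two offsets can share one long without growing the {{long[]}}
array, here is a sketch under assumed bit widths. The 48/16 split, the class name `PackedOffset`
and its methods are hypothetical; the actual layout is still being evaluated and may differ.

```java
final class PackedOffset {
    // Assumption: the low 16 bits hold the column offset, the remaining 48 bits the row offset.
    static final int COLUMN_BITS = 16;
    static final long COLUMN_MASK = (1L << COLUMN_BITS) - 1;
    static final long MAX_ROW_OFFSET = (1L << (Long.SIZE - COLUMN_BITS)) - 1;

    /** Packs both offsets into a single long, rejecting values that do not fit. */
    static long pack(long rowOffset, int columnOffset) {
        if (rowOffset < 0 || rowOffset > MAX_ROW_OFFSET)
            throw new IllegalArgumentException("row offset out of range: " + rowOffset);
        if (columnOffset < 0 || columnOffset > COLUMN_MASK)
            throw new IllegalArgumentException("column offset out of range: " + columnOffset);
        return (rowOffset << COLUMN_BITS) | columnOffset;
    }

    /** Unsigned shift, so row offsets that use the top bit are recovered correctly. */
    static long rowOffset(long packed) {
        return packed >>> COLUMN_BITS;
    }

    static int columnOffset(long packed) {
        return (int) (packed & COLUMN_MASK);
    }
}
```

The trade-off is that each component gets a smaller range than a full long, so the split has to
be chosen to fit the largest offsets the data file can produce.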

cc [~slebresne] [~iamaleksey] [~jbellis] [~beobal]


> Address rows rather than partitions in SASI
> -------------------------------------------
>
>                 Key: CASSANDRA-11990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>         Attachments: perf.pdf, size_comparison.png
>
>
> Currently, the lookup in SASI index would return the key position of the partition. After
> the partition lookup, the rows are iterated and the operators are applied in order to filter
> out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different partitioners,
> collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
