cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11990) Address rows rather than partitions in SASI
Date Tue, 09 Aug 2016 07:12:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413079#comment-15413079
] 

Alex Petrov commented on CASSANDRA-11990:
-----------------------------------------

bq. Can you move drop-data and rebuild related changes to the separate branch so we can keep
changes to token tree and bug fixes separate since they are separate tickets?

Sure, this was already done: [CASSANDRA-12374] and [CASSANDRA-12378].

bq. extend tokens sizes to variable size. instead of having intermediate of multi-size fixed
tokens...

As far as I understood, SASI relies heavily on the fact that the tokens are fixed size. Since
there will never be
a situation where there's more than one partitioner per sstable, it's a good idea to always
be able to calculate
offsets based on the token size.

For partitioners with variable-size tokens, we can send them down the "slow path". In the end,
{{ByteOrderedPartitioner}} is not recommended for wide use anyway, and making it as fast as
calculating offsets for fixed-size tokens won't be possible either. So we'd use the fixed-size
tokens for performance where possible.
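
To illustrate the difference between the two paths, here is a minimal sketch. The class name,
method names and the length-prefixed layout are all hypothetical and are not SASI's actual
on-disk format. With fixed-size tokens the offset of the i-th entry is plain arithmetic; with
variable-size tokens we have to walk over every preceding entry.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is not SASI's real layout or API.
final class TokenOffsetSketch {
    // Fast path: every token occupies tokenSize bytes, so the i-th token
    // starts at base + i * tokenSize and can be located in O(1).
    static long fixedOffset(long base, int index, int tokenSize) {
        return base + (long) index * tokenSize;
    }

    // Slow path: assume each entry is a 4-byte length followed by that many
    // token bytes. Finding the i-th entry means skipping the i entries before it.
    static int variableOffset(ByteBuffer buf, int index) {
        int pos = 0;
        for (int i = 0; i < index; i++) {
            int len = buf.getInt(pos);      // absolute read, position is untouched
            pos += Integer.BYTES + len;     // skip the length prefix and the token
        }
        return pos;
    }
}
```

The slow path is linear in the number of preceding entries unless an extra offset table is
stored, which is the cost the fixed-size assumption avoids.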

bq. the way TokenTreeSerializationHelper is currently done is not optimal since it has to
be explicitly carried around ... we should either use IPartitioner interface or nothing at
all.

Completely agree with you. It was only done because we always have to pass it all the way
through to the {{TokenBuilder}}
instances. Higher level abstractions only "know" about it to delegate it further. This helper
is just a small adapter
between partitioner and SASI code, too.

I still suggest keeping the changes related to Tokens. When we decide to add support for any
partitioner, very similar changes will still have to be made. We can improve the situation
with the current abstraction by relying on the fact that {{Partitioner}} is a singleton in
{{DatabaseDescriptor}}, so higher level abstractions will never even see it.
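
A rough sketch of what that could look like follows. All names here ({{SketchPartitioner}},
{{SketchDescriptor}}, {{SketchTokenBuilder}}, {{SketchIndexWriter}}) are made up for
illustration and are not the real Cassandra or SASI classes, and the toy partitioner simply
reads the first 8 bytes of the key. The point is only that the lowest layer looks the
partitioner up itself, so nothing above it has to carry a helper around.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a partitioner: maps a key to a fixed-size (long) token.
interface SketchPartitioner {
    long tokenOf(ByteBuffer key);
}

// Hypothetical stand-in for a process-wide holder of the single partitioner.
final class SketchDescriptor {
    // Toy default: treat the first 8 bytes of the key as the token.
    private static final SketchPartitioner PARTITIONER =
            key -> key.getLong(key.position());

    static SketchPartitioner getPartitioner() {
        return PARTITIONER;
    }
}

// Lowest layer: the only place that needs to know a partitioner exists.
final class SketchTokenBuilder {
    long build(ByteBuffer key) {
        return SketchDescriptor.getPartitioner().tokenOf(key);
    }
}

// Higher layer: no partitioner or helper parameter is passed through it.
final class SketchIndexWriter {
    private final SketchTokenBuilder builder = new SketchTokenBuilder();

    long add(ByteBuffer key) {
        return builder.build(key);
    }
}
```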

Rolling back everything would be a large chunk of work twice over: now, to undo it (together
with all test changes and original PR changes), and again shortly afterwards, to re-introduce
it when we do RP (or any other partitioner) support. Unfortunately, there's no way to bring in
that support other than passing Token instances (or a SASI wrapper on top of them), so this
change, even though it touches quite a few places, is still minimal.

> Address rows rather than partitions in SASI
> -------------------------------------------
>
>                 Key: CASSANDRA-11990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL, sasi
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>         Attachments: perf.pdf, size_comparison.png
>
>
> Currently, the lookup in SASI index would return the key position of the partition. After
the partition lookup, the rows are iterated and the operators are applied in order to filter
out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different partitioners,
collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
