cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-11990) Address rows rather than partitions in SASI
Date Sun, 26 Jun 2016 15:21:33 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15350152#comment-15350152
] 

Alex Petrov edited comment on CASSANDRA-11990 at 6/26/16 3:21 PM:
------------------------------------------------------------------

I've done a small investigation of what it would take and talked to several people about potential
scenarios. First of all, I'd note that it will be a rather big change, which will require
a new format for writing the SASI {{TokenTree}}. I'll list several steps that would need to
be taken:

  * tl;dr version: we have to extend {{TokenTree}} to fit a row offset along with the partition key.
More elaborate version: currently the SASI {{TokenTree}} is a highly optimised tree that aims to
encode {{long token}}/{{short + int}} entries. Since the max offset size does not exceed 48 bits
(a {{short}} plus an {{int}}), several further optimisations are applied to improve read
performance and reduce storage overhead. A short description of the current format can be found [here|https://gist.github.com/ifesdjeen/0436faf9a66b401ace0ad0947d256317].
Since we'll have to hold two offsets (partition and row; the partition offset is required
to read the PK, static rows, etc.), as a first step, for the proof of concept, we'll reduce
the number of distinctions to simpler cases (single and multiple offsets). The rest of the
possible optimisation combinations can be added later (the most obvious being the case where both
items fit into a single long, and possibly more distinctions if they're flexibly skippable/addressable).
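
To make the {{short + int}} encoding and the "both items fit into a single long" case more concrete, here is a minimal Java sketch. It is an illustration only: the class name, the method names and the 32/32-bit split used for the pair are assumptions made for this example and are not taken from the actual SASI code or on-disk format.

```java
// Illustrative sketch only: names and bit layouts are assumptions,
// not the actual SASI TokenTree format.
final class OffsetPacking {
    /** Largest offset representable in 48 bits (a 16-bit short plus a 32-bit int). */
    static final long MAX_OFFSET = (1L << 48) - 1;

    /** Upper 16 bits of a 48-bit offset. */
    static short upperShort(long offset) {
        checkRange(offset);
        return (short) (offset >>> 32);
    }

    /** Lower 32 bits of a 48-bit offset. */
    static int lowerInt(long offset) {
        checkRange(offset);
        return (int) offset;
    }

    /** Rebuilds the offset; the masks undo sign extension of the short and the int. */
    static long combine(short upper, int lower) {
        return ((upper & 0xFFFFL) << 32) | (lower & 0xFFFFFFFFL);
    }

    /** Hypothetical single-long case: each offset fits into 32 unsigned bits. */
    static boolean fitsInSingleLong(long partitionOffset, long rowOffset) {
        return (partitionOffset >>> 32) == 0 && (rowOffset >>> 32) == 0;
    }

    /** Packs both offsets into one long: partition in the high half, row in the low half. */
    static long packPair(long partitionOffset, long rowOffset) {
        if (!fitsInSingleLong(partitionOffset, rowOffset))
            throw new IllegalArgumentException("offsets do not fit into a single long");
        return (partitionOffset << 32) | rowOffset;
    }

    static long partitionOffsetOf(long packed) {
        return packed >>> 32;
    }

    static long rowOffsetOf(long packed) {
        return packed & 0xFFFFFFFFL;
    }

    private static void checkRange(long offset) {
        if (offset < 0 || offset > MAX_OFFSET)
            throw new IllegalArgumentException("offset does not fit into 48 bits: " + offset);
    }
}
```

Pairs that fail {{fitsInSingleLong}} would fall back to the more general "multiple offsets" case mentioned above; how that case is laid out is left open here.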

  * currently, we can only read the decorated partition key, so we need to extend the storage
format to address a single row as well
  * we need to extend {{TokenTree}} to support other partitioners (whether or not that is
done in the scope of this ticket, we'll have to make sure we're not making it harder to extend
it this way)
  * there might be a need to store an order-preserving hash of clustering keys for queries
where a row is split across multiple SSTables, although I have to gather more information on
that one, as we might be able to resolve rows after reading them from the SSTables
  * we'll need to find migration/upgrade paths from the current format, which may involve either
re-indexing and failing queries while the upgrade is in progress, or supporting two format versions
at read time, so that reads from the old format keep working while indexes are rebuilt
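
The "two format versions at read time" option from the last item could look roughly like the Java sketch below. Everything in it is hypothetical: the version byte, the field layout and the names ({{VersionedEntryReader}}, {{Entry}}, {{NO_ROW}}) are invented for illustration and do not describe the real SASI index format.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: the version tags and entry layout are assumptions.
final class VersionedEntryReader {
    /** Hypothetical tag for the old format: token + partition offset. */
    static final byte PARTITION_ONLY = 1;
    /** Hypothetical tag for the new format: token + partition offset + row offset. */
    static final byte PARTITION_AND_ROW = 2;
    /** Marker for entries written in the old format, which carry no row offset. */
    static final long NO_ROW = -1L;

    static final class Entry {
        final long token;
        final long partitionOffset;
        final long rowOffset;

        Entry(long token, long partitionOffset, long rowOffset) {
            this.token = token;
            this.partitionOffset = partitionOffset;
            this.rowOffset = rowOffset;
        }
    }

    /** Reads one entry, choosing the decoder from the leading version byte. */
    static Entry read(ByteBuffer in) {
        byte version = in.get();
        switch (version) {
            case PARTITION_ONLY:
                // Old format: the caller has to iterate the partition to find matching rows.
                return new Entry(in.getLong(), in.getLong(), NO_ROW);
            case PARTITION_AND_ROW:
                // New format: the row can be addressed directly.
                return new Entry(in.getLong(), in.getLong(), in.getLong());
            default:
                throw new IllegalArgumentException("unknown format version: " + version);
        }
    }
}
```

With a dispatch like this, entries with {{rowOffset == NO_ROW}} would still go through the existing partition-level read path until the index has been rebuilt in the new format.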

cc [~xedin] [~beobal] [~jrwest] 


> Address rows rather than partitions in SASI
> -------------------------------------------
>
>                 Key: CASSANDRA-11990
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11990
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL
>            Reporter: Alex Petrov
>            Assignee: Alex Petrov
>
> Currently, the lookup in SASI index would return the key position of the partition. After
the partition lookup, the rows are iterated and the operators are applied in order to filter
out ones that do not match.
> bq. TokenTree which accepts variable size keys (such would enable different partitioners,
collections support, primary key indexing etc.), 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
