cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-1472) Add bitmap secondary indexes
Date Fri, 12 Apr 2013 05:25:18 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003216#comment-13003216 ]

Stu Hood edited comment on CASSANDRA-1472 at 4/12/13 5:23 AM:
--------------------------------------------------------------

tjake: Yea: opening a separate ticket to discuss a generic file format makes sense.
----
I had a realization about this implementation of secondary indexes: I was originally thinking
we'd be able to push all boolean queries down to the indexes on a per-sstable basis, but this
is unfortunately not the case. We will not be able to push 'AND' on separate indexes down
to the sstables themselves: we'd need to join the index from all sstables, since a row might
contain one clause in one sstable, and another clause in another sstable.
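
To make the problem concrete, here is a minimal Python sketch. It is not Cassandra code:
the dictionary-based SSTable model, the column names, and the helper functions are all
hypothetical, and overwrites and timestamps are ignored. It shows a row whose two matching
columns live in different sstables being missed when the AND is evaluated inside each
sstable, and found when each clause is first joined across all sstables.

```python
# Hypothetical model: each SSTable maps row key -> {column: value} for the
# columns written to that SSTable. Not Cassandra's actual on-disk layout.
sstable_a = {"row1": {"state": "CA"}, "row2": {"state": "CA", "age": 30}}
sstable_b = {"row1": {"age": 30}}

def index_lookup(sstable, column, value):
    """Row keys in one SSTable whose column equals value (stand-in for a per-SSTable index)."""
    return {key for key, cols in sstable.items() if cols.get(column) == value}

def and_per_sstable(sstables, clauses):
    """Evaluate the AND inside each SSTable, then union the per-SSTable results."""
    result = set()
    for sst in sstables:
        result |= set.intersection(*(index_lookup(sst, c, v) for c, v in clauses))
    return result

def and_after_join(sstables, clauses):
    """Union each clause's matches across all SSTables first, then intersect."""
    per_clause = [
        set().union(*(index_lookup(sst, c, v) for sst in sstables))
        for c, v in clauses
    ]
    return set.intersection(*per_clause)

clauses = [("state", "CA"), ("age", 30)]
pushed_down = and_per_sstable([sstable_a, sstable_b], clauses)  # misses row1
joined = and_after_join([sstable_a, sstable_b], clauses)        # finds row1 and row2
```

In this toy data, row1 has state in one sstable and age in the other, so only the joined
evaluation returns it.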

EDIT: This is roughly equivalent to what we'd need to do with a KEYS index (seek-wise), meaning
that the advantage is mostly in space utilization and lack of locks.
EDIT2: So, there _is_ a way to execute AND queries directly per SSTable, but it involves some
uncertainty. For a particular row, if a value involved in a multi-clause query is NULL in
a particular SSTable, then you have to accept the row as a possible match, and resolve the
uncertainty later. I'm sure there is a way to incorporate the CASSANDRA-2498 timestamp resolution
as well, although it doesn't occur to me at the moment.
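
The "possible match" idea in EDIT2 could be sketched roughly as follows. This is a Python
illustration under stated assumptions, not the patch's implementation: the SSTable model
and function names are hypothetical, a missing column stands in for NULL, and it assumes
no column of a row is overwritten in another SSTable, so timestamp resolution is left out
entirely.

```python
# Hypothetical model, not Cassandra code: each SSTable maps
# row key -> {column: value}; a missing column plays the role of NULL.
# Assumes no overwrites across SSTables, so timestamp resolution
# (CASSANDRA-2498) is deliberately not modeled.
SSTABLES = [
    {
        "row1": {"state": "CA"},
        "row2": {"state": "CA", "age": 30},
        "row3": {"state": "CA", "age": 41},
    },
    {"row1": {"age": 30}, "row4": {"age": 30}},
]
CLAUSES = [("state", "CA"), ("age", 30)]

def candidates_in_sstable(sstable, clauses):
    """Rows that match at least one clause and contradict none in this SSTable."""
    found = set()
    for key, cols in sstable.items():
        # Only clauses whose column is present here can be judged; the rest are unknown.
        present = [cols[c] == v for c, v in clauses if c in cols]
        if present and all(present):
            found.add(key)
    return found

def resolve(sstables, clauses, candidates):
    """Merge each candidate row across SSTables and keep only the definite matches."""
    confirmed = set()
    for key in candidates:
        merged = {}
        for sst in sstables:
            merged.update(sst.get(key, {}))
        if all(merged.get(c) == v for c, v in clauses):
            confirmed.add(key)
    return confirmed

candidates = set().union(*(candidates_in_sstable(s, CLAUSES) for s in SSTABLES))
matches = resolve(SSTABLES, CLAUSES, candidates)
```

Here row3 is rejected inside its SSTable because one clause is definitely false, row4 is
accepted as a possible match and dropped during resolution, and row1 survives even though
its two columns sit in different SSTables.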
                
      was (Author: stuhood):
    tjake: Yea: opening a separate ticket to discuss a generic file format makes sense.
----
I had a realization about this implementation of secondary indexes: I was originally thinking
we'd be able to push all boolean queries down to the indexes on a per-sstable basis, but this
is unfortunately not the case. We will not be able to push 'AND' on separate indexes down
to the sstables themselves: we'd need to join the index from all sstables, since a row might
contain one clause in one sstable, and another clause in another sstable. EDIT: This is roughly
equivalent to what we'd need to do with a KEYS index (seek-wise), meaning that the advantage
is mostly in space utilization and lack of locks.
                  
> Add bitmap secondary indexes
> ----------------------------
>
>                 Key: CASSANDRA-1472
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>         Attachments: 0.7-1472-v5.tgz, 0.7-1472-v6.tgz, 1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz,
> anatomy.png, ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-1472-rebased-to-0.7-branch.txt, ASF.LICENSE.NOT.GRANTED--0019-Rename-bugfixes-and-fileclose.txt,
> v4-bench-c32.txt
>
>
> Bitmap indexes are a very efficient structure for dealing with immutable data. We can
> take advantage of the fact that SSTables are immutable by attaching them directly to SSTables
> as a new component (supported by CASSANDRA-1471).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
