cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Issue Comment Edited: (CASSANDRA-1472) Add bitmap secondary indexes
Date Wed, 12 Jan 2011 05:33:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12980568#action_12980568 ]

Stu Hood edited comment on CASSANDRA-1472 at 1/12/11 12:33 AM:
---------------------------------------------------------------

>> I think this implementation should probably be the only secondary index format we support (What's the value of keeping KEYS over this?)
> Agreed, pending the optimizations mentioned in previous comments.
I need to retract this statement: I don't think we should remove the KEYS index, for a few
reasons:
 # The false positives generated by these bitmap indexes have worst-case behaviour with rapidly
changing data: if a row matches the index in _any sstable_, we have to read all sstables that
contain the row in order to resolve the data and determine whether it _actually_ matches.
On the other hand, they will be damn near optimal when updates are a smaller fraction of
the workload.
 # The reason the implementation is called KEYS is that it is a traditional index, containing
only a pointer to the base data (a key). KEYS indexes are only one step away from being fully
materialized views, which has a lot of potential for the entity groups that Ellis is interested in.
 # KEYS indexes will gain _very_ nice benefits from the compression introduced by CASSANDRA-674
(since they contain value-sorted data).
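
To make the false-positive cost in item 1 concrete, here is a minimal sketch in Python. It is not Cassandra code: the SSTable class, index_matches() and query() are hypothetical names, the per-sstable bitmap is modelled as a plain set of keys, and "newest timestamp wins" is assumed as the resolution rule.

```python
class SSTable:
    """Hypothetical immutable table: row key -> (value, timestamp)."""

    def __init__(self, rows):
        self.rows = dict(rows)

    def index_matches(self, value):
        # Stand-in for a bitmap lookup: keys whose value in THIS sstable matches.
        return {k for k, (v, _) in self.rows.items() if v == value}


def query(sstables, value):
    """Return (true matches, index candidates, number of row reads)."""
    # A row is a candidate if it matches the index in ANY sstable.
    candidates = set()
    for table in sstables:
        candidates |= table.index_matches(value)

    results, reads = set(), 0
    for key in candidates:
        # Resolve the row: read every sstable that contains it.
        versions = [t.rows[key] for t in sstables if key in t.rows]
        reads += len(versions)
        latest_value, _ = max(versions, key=lambda vt: vt[1])
        if latest_value == value:
            results.add(key)
    return results, candidates, reads


# Row "a" was "red" in the older sstable and was later overwritten with "blue".
old = SSTable({"a": ("red", 1), "b": ("red", 1)})
new = SSTable({"a": ("blue", 2)})
results, candidates, reads = query([old, new], "red")
```

In this sketch "a" is a false positive: the older sstable's index still reports it, so both sstables are read before it can be rejected. The more often rows are overwritten, the more such stale candidates accumulate.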

EDIT: Shoot... as soon as I posted this I thought of a solution to #1. Anyway: consider the
later items.

> Add bitmap secondary indexes
> ----------------------------
>
>                 Key: CASSANDRA-1472
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7.1
>
>         Attachments: 0.7-1472-v5.tgz, 0.7-1472-v6.tgz, 0019-Rename-bugfixes-and-fileclose.txt,
1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt
>
>
> Bitmap indexes are a very efficient structure for dealing with immutable data. We can take advantage of the fact that SSTables are immutable by attaching them directly to SSTables as a new component (supported by CASSANDRA-1471).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

