cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-1472) Add bitmap secondary indexes
Date Tue, 11 Jan 2011 10:17:46 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stu Hood updated CASSANDRA-1472:
--------------------------------

    Attachment: 0.7-1472-v6.tgz

> I renamed KEYS_BITMAP to just BITMAP, fixed some spots that could leak files, and fixed
> a compaction bug related to 1916 with testcase.
I incorporated your changes into the latest tarball as 0018, and fixed some silliness in 0019
and 0020.

> There are some changes in here that seem to be bug fixes for other issues, specifically
> the changes to CFMetaData.java
Dropped from this patch, and added on CASSANDRA-1962

> I see in SSTableWriter that BMT will fail on secondary indexed CFs now. Why fail though?
> Can't they just be built on restart?
Yes, probably, but the naive approach is not very elegant: when we see the first BMT
append, we'll already have the secondary indexes open, so we need to null them out. A better
approach would need to indicate to the SSTW constructor/factory that we intend to
write without certain component types. I think this can go in another ticket?

> The whole BitmapIndexWriter Scratch space has me slightly concerned.
There is an alternative to the layout I've implemented here, but it is slower for the most
common query type (equality on one bucket), and only slightly faster for extremely general
index queries (LT/GT involving most/all of the buckets). We can measure the actual overhead
on a single sstable if you'd like. 
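
To make the trade-off concrete, here is a minimal in-memory sketch of a one-bitmap-per-bucket layout. It is not the on-disk format from the patch: the class name BitmapIndexSketch, the integer bucket ids, and the use of java.util.BitSet are all assumptions chosen for illustration. It only shows why equality touches a single bitmap while an LT query has to combine every bucket below the bound.

```java
import java.util.BitSet;
import java.util.TreeMap;

/** Hypothetical sketch: one bitmap per bucket over the rows of an immutable segment. */
final class BitmapIndexSketch {
    // Bucket id -> bitmap of row positions that fall in that bucket.
    // Kept sorted so a range query can walk a contiguous run of buckets.
    private final TreeMap<Integer, BitSet> buckets = new TreeMap<>();
    private final int rowCount;

    /** Builds the index once; the rows are assumed never to change afterwards. */
    BitmapIndexSketch(int[] bucketPerRow) {
        rowCount = bucketPerRow.length;
        for (int row = 0; row < rowCount; row++) {
            buckets.computeIfAbsent(bucketPerRow[row], b -> new BitSet(rowCount)).set(row);
        }
    }

    /** Equality on one bucket: a single bitmap lookup (copied so callers cannot mutate it). */
    BitSet equalTo(int bucket) {
        BitSet hit = buckets.get(bucket);
        return hit == null ? new BitSet() : (BitSet) hit.clone();
    }

    /** LT query: ORs together the bitmap of every bucket strictly below the bound. */
    BitSet lessThan(int bound) {
        BitSet result = new BitSet(rowCount);
        for (BitSet bitmap : buckets.headMap(bound, false).values()) {
            result.or(bitmap);
        }
        return result;
    }
}
```

With rows bucketed as {3, 1, 3, 2, 1}, equalTo(3) reads one bitmap and returns rows 0 and 2, while lessThan(3) has to merge the bitmaps for buckets 1 and 2.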

> AVRO, I don't see the value here. [...] The value of using our BRAF is you have all the
> work to avoid polluting the page cache
I could go either way on this point: on one hand, this is an extremely simple structure. On
the other hand, we get large benefits from compression here, and I'm fairly certain we should
use Avro for the rest of the sstable.

Also, it's very simple to use our FileDataInput implementations here via Avro's SeekableInput
interface, so we don't necessarily need to throw away any effort. See https://github.com/stuhood/cassandra/commit/1a5c9115cb1410519eff15dd3089772b1e550ae7
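
As a rough sketch of what such an adapter looks like (the linked commit is the real thing; this is not copied from it): Avro's org.apache.avro.file.SeekableInput asks for seek, tell, length, a bulk read, and close. So that the example compiles without Avro on the classpath, a local stand-in interface named SeekableInputSketch mirrors those methods, and java.io.RandomAccessFile stands in for Cassandra's FileDataInput implementation. Both names are assumptions for illustration only.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.RandomAccessFile;

/** Local stand-in mirroring the methods of org.apache.avro.file.SeekableInput. */
interface SeekableInputSketch extends Closeable {
    void seek(long p) throws IOException;
    long tell() throws IOException;
    long length() throws IOException;
    int read(byte[] b, int off, int len) throws IOException;
}

/** Adapter sketch: RandomAccessFile stands in for the existing seekable file reader. */
final class RandomAccessSeekableInput implements SeekableInputSketch {
    private final RandomAccessFile file;

    RandomAccessSeekableInput(RandomAccessFile file) {
        this.file = file;
    }

    // Each method delegates directly, so the underlying reader's buffering
    // and caching behaviour is preserved rather than replaced.
    @Override public void seek(long p) throws IOException { file.seek(p); }
    @Override public long tell() throws IOException { return file.getFilePointer(); }
    @Override public long length() throws IOException { return file.length(); }
    @Override public int read(byte[] b, int off, int len) throws IOException {
        return file.read(b, off, len);
    }
    @Override public void close() throws IOException { file.close(); }
}
```

In real code the class would implement Avro's interface directly and could then be handed to an Avro file reader; the delegation pattern stays the same.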

> I mentioned above that on the fly indexes should be allowed, however this can happen
> in a subsequent ticket if you prefer.
Yes, I'd prefer that. It will likely be the highest priority of the 4-5 tickets we need to
create if/when this issue goes in.

> As Nick mentioned it would be nice to have some stats on the index available in JMX,
> for a subsequent ticket.
Agreed.

> I think this implementation should probably be the only secondary index format we support
> (What's the value of keeping KEYS over this?)
Agreed, pending the optimizations mentioned in previous comments.

> Add bitmap secondary indexes
> ----------------------------
>
>                 Key: CASSANDRA-1472
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1472
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Stu Hood
>            Assignee: Stu Hood
>             Fix For: 0.7.1
>
>         Attachments: 0.7-1472-v5.tgz, 0.7-1472-v6.tgz, 0019-Rename-bugfixes-and-fileclose.txt,
>                      1472-v3.tgz, 1472-v4.tgz, 1472-v5.tgz, anatomy.png, v4-bench-c32.txt
>
>
> Bitmap indexes are a very efficient structure for dealing with immutable data. We can
> take advantage of the fact that SSTables are immutable by attaching them directly to SSTables
> as a new component (supported by CASSANDRA-1471).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

