incubator-cassandra-user mailing list archives

From David Boxenhorn <da...@lookin2.com>
Subject Predicate Indexes
Date Thu, 03 Jun 2010 07:47:53 GMT
So I've been thinking about the problem of how to do range queries on keys
with random partitioning. I'm new to Cassandra, and I don't know what the
plans are, but I have an idea and I thought I'd just put it out there:
Predicate Indexes.

I would like to be able to define predicate indexes in Cassandra, something
like this:

<ColumnFamily Name="Super1" ColumnType="Super" CompareWith="BytesType" CompareSubcolumnsWith="BytesType">
        <Index Name="Cat1" Type="Range" Start="CATEGORY1." Finish="CATEGORY1/" />
        <Index Name="Cat2" Type="Regex" Regex="^CATEGORY2." />
</ColumnFamily>
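
A note on the Range bounds: in ASCII, '/' (0x2F) is the character immediately after '.' (0x2E), so the range from "CATEGORY1." up to "CATEGORY1/" covers exactly the keys that begin with "CATEGORY1.". A minimal Python sketch of that arithmetic, assuming byte-wise comparison (as with BytesType) and an exclusive Finish, neither of which is specified above; prefix_range and in_range are hypothetical helpers, not Cassandra functions:

```python
from typing import Tuple


def prefix_range(prefix: bytes) -> Tuple[bytes, bytes]:
    """Return (start, finish) so that start <= key < finish iff key starts with prefix.

    Assumes byte-wise ordering and a non-empty prefix whose last byte is not 0xFF.
    """
    if not prefix or prefix[-1] == 0xFF:
        raise ValueError("prefix must be non-empty and must not end in 0xFF")
    # Bump the last byte by one: '.' (0x2E) becomes '/' (0x2F).
    finish = prefix[:-1] + bytes([prefix[-1] + 1])
    return prefix, finish


def in_range(key: bytes, start: bytes, finish: bytes) -> bool:
    # Assumption: Start is inclusive and Finish is exclusive.
    return start <= key < finish


start, finish = prefix_range(b"CATEGORY1.")
```

With these bounds, a key such as "CATEGORY1.abc" falls inside the range, while "CATEGORY10.abc" and "CATEGORY2.abc" fall outside it.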

At each node, Cassandra would maintain one index per predicate, containing
every key on that node that matches the predicate. Within each index, keys
would be kept in the order implied by the Random Partitioner.
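
As a rough illustration of the per-node index described above, here is a minimal Python sketch. It assumes that Random Partitioner order can be approximated by the MD5 digest of the key read as an integer. The class PredicateIndex, its methods, and the in-memory sorted list are hypothetical stand-ins for whatever on-disk structure Cassandra would actually use; none of them are existing Cassandra APIs.

```python
import bisect
import hashlib
import re
from typing import Callable, List, Tuple


def token(key: bytes) -> int:
    # Assumption: approximate Random Partitioner order by the key's MD5 digest.
    return int.from_bytes(hashlib.md5(key).digest(), "big")


class PredicateIndex:
    """Hypothetical per-node index of the keys that satisfy one predicate."""

    def __init__(self, name: str, predicate: Callable[[bytes], bool]) -> None:
        self.name = name
        self.predicate = predicate
        self._entries: List[Tuple[int, bytes]] = []  # kept sorted by (token, key)

    def on_write(self, key: bytes) -> None:
        """Called for every key written to this node; indexes it if it matches."""
        if not self.predicate(key):
            return
        entry = (token(key), key)
        pos = bisect.bisect_left(self._entries, entry)
        # Skip keys that are already indexed.
        if pos == len(self._entries) or self._entries[pos] != entry:
            self._entries.insert(pos, entry)

    def keys(self) -> List[bytes]:
        """Return the indexed keys in token order."""
        return [key for _, key in self._entries]


# The two indexes from the example configuration (Finish treated as exclusive).
cat1 = PredicateIndex("Cat1", lambda k: b"CATEGORY1." <= k < b"CATEGORY1/")
cat2 = PredicateIndex("Cat2", lambda k: re.match(rb"^CATEGORY2.", k) is not None)

for key in (b"CATEGORY1.a", b"CATEGORY2.b", b"OTHER.c", b"CATEGORY1.d"):
    cat1.on_write(key)
    cat2.on_write(key)
```

The price is one extra index entry per matching key per index, which is the disk-space cost mentioned below.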

A new attribute, Name, should be added to KeyRange (i.e. setName(String
name), getName(), etc.) to select which index to read.

When we loop through the keys, we would pass the last key in as the start
key, until we finish, as we do now. The results would not be ordered, but we
would have very quick access to the entire range implied by the predicate.
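
From the client side, the paging loop described above could look roughly like this. It is a sketch only: the KeyRange dataclass with a name field mirrors the proposed attribute, and FakeNode.get_range is a hypothetical in-memory stand-in for a range call, not an existing Cassandra API. Order is approximated by the MD5 digest of the key, and the start key is assumed to be inclusive, so every page after the first drops its first entry.

```python
import bisect
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Iterator, List


def token(key: bytes) -> int:
    # Assumption: approximate Random Partitioner order by the key's MD5 digest.
    return int.from_bytes(hashlib.md5(key).digest(), "big")


@dataclass
class KeyRange:
    name: str          # proposed new attribute: which predicate index to scan
    start_key: bytes   # b"" means "from the beginning of the index"
    count: int = 100


class FakeNode:
    """Hypothetical node holding one sorted key list per predicate index."""

    def __init__(self, predicates: Dict[str, Callable[[bytes], bool]],
                 keys: Iterable[bytes]) -> None:
        keys = list(keys)
        self._indexes = {
            name: sorted((token(k), k) for k in keys if pred(k))
            for name, pred in predicates.items()
        }

    def get_range(self, key_range: KeyRange) -> List[bytes]:
        entries = self._indexes[key_range.name]
        begin = 0
        if key_range.start_key:
            probe = (token(key_range.start_key), key_range.start_key)
            begin = bisect.bisect_left(entries, probe)
        return [k for _, k in entries[begin:begin + key_range.count]]


def scan(node: FakeNode, name: str, page_size: int = 100) -> Iterator[bytes]:
    """Yield every key in the named index, one page at a time."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2 with an inclusive start key")
    start = b""
    while True:
        page = node.get_range(KeyRange(name=name, start_key=start, count=page_size))
        # The start key is inclusive, so drop the key we already returned.
        if start and page and page[0] == start:
            page = page[1:]
        if not page:
            return
        yield from page
        start = page[-1]


all_keys = [b"CATEGORY1.%d" % i for i in range(7)] + [b"OTHER.x"]
node = FakeNode({"Cat1": lambda k: b"CATEGORY1." <= k < b"CATEGORY1/"}, all_keys)
result = list(scan(node, "Cat1", page_size=3))
```

The keys come back in token order rather than key order, but only keys matching the predicate are ever touched.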

I very much want something like this. I am willing to pay the price in disk
space.

Yes, I know that something like this can be approximated with super columns.
But super columns have well-known problems: first, practical limits on their
size; second, the extra round-trips that working with them requires; and
third, the cost of maintaining them by hand.
