cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-749) Secondary indices for column families
Date Fri, 12 Mar 2010 23:39:27 GMT


Jonathan Ellis commented on CASSANDRA-749:

> Is it worth creating a secondary index that only contains local data, versus a distributed
> secondary index (a normal ColumnFamily?)

I think my initial reasoning was wrong here.  I was anti-local-indexes because "we have to
query the full cluster for any index lookup, since we are throwing away our usual partitioning."

Which is true, but it ignores the fact that, in most cases, you will have to "query the full
cluster" to get the actual matching rows, b/c the indexed rows will be spread across all machines.
So, having local indexes is better in the common case, since it actually saves a round trip
from querying the index to querying the rows.
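
To make the round-trip argument concrete, here is a toy Python model. It is only a sketch and not Cassandra code: the Node class, the modulo-based owner() placement, and the way round trips are counted are all assumptions made for illustration.

```python
from collections import defaultdict


class Node:
    """Toy node: stores the rows it owns plus a local index over them."""

    def __init__(self):
        self.rows = {}                       # row key -> {column: value}
        self.local_index = defaultdict(set)  # (column, value) -> local row keys

    def put(self, key, row):
        self.rows[key] = row
        for col, val in row.items():
            self.local_index[(col, val)].add(key)

    def query_local(self, col, val):
        # Index lookup and row fetch happen on the same node, in one request.
        return {k: self.rows[k] for k in self.local_index[(col, val)]}


def owner(key, nodes):
    # Hypothetical placement: integer row keys, placed by modulo.
    return nodes[key % len(nodes)]


def local_index_query(nodes, col, val):
    """Fan out once to every node; each answers from its own index."""
    result = {}
    for node in nodes:  # conceptually parallel, so one round trip
        result.update(node.query_local(col, val))
    return result, 1


def distributed_index_query(nodes, index, col, val):
    """Round 1 reads the index entry; round 2 fetches rows from their owners."""
    keys = index.get((col, val), set())                   # round trip 1
    result = {k: owner(k, nodes).rows[k] for k in keys}   # round trip 2
    return result, 2


# Small demo: 9 rows spread over 3 nodes, indexed on "state".
nodes = [Node() for _ in range(3)]
distributed_index = defaultdict(set)  # stands in for an index ColumnFamily
for key in range(9):
    row = {"state": "TX" if key % 2 == 0 else "CA"}
    owner(key, nodes).put(key, row)
    distributed_index[("state", row["state"])].add(key)

local_rows, local_rounds = local_index_query(nodes, "state", "TX")
dist_rows, dist_rounds = distributed_index_query(
    nodes, distributed_index, "state", "TX")
```

Both queries return the same rows and both end up touching every node, but in this model the local-index version needs one round of requests where the distributed version needs two. Each node also only ever indexes its own rows, so the index is split across the cluster without any extra work.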

Also, having each node index the rows it has locally means you don't have to worry about sharding
a very large index since it happens automatically.

Finally, it lets us use the local commitlog to keep index + data in sync.
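
A minimal sketch of that last point, again as a toy model and not the actual Cassandra write path: the IndexedStore class and its list-based commitlog are hypothetical. The idea shown is that one log record per write drives both the row update and the index update, so replaying the log rebuilds both consistently.

```python
class IndexedStore:
    """Toy single-node store where one log record covers data and index."""

    def __init__(self):
        self.commitlog = []  # stand-in for an append-only log on disk
        self.rows = {}       # row key -> {column: value}
        self.index = {}      # (column, value) -> set of row keys

    def _apply(self, key, col, val):
        # Drop the stale index entry, then update the row and index together.
        old = self.rows.get(key, {}).get(col)
        if old is not None:
            self.index.get((col, old), set()).discard(key)
        self.rows.setdefault(key, {})[col] = val
        self.index.setdefault((col, val), set()).add(key)

    def write(self, key, col, val):
        self.commitlog.append((key, col, val))  # log first, then apply
        self._apply(key, col, val)

    @classmethod
    def recover(cls, commitlog):
        # Replaying the same records rebuilds rows and index in lockstep.
        store = cls()
        for record in commitlog:
            store.commitlog.append(record)
            store._apply(*record)
        return store


store = IndexedStore()
store.write("k1", "state", "TX")
store.write("k1", "state", "CA")   # overwrite moves k1 in the index
store.write("k2", "state", "CA")
recovered = IndexedStore.recover(store.commitlog)
```

With a distributed index the row and its index entry would generally live on different nodes, so no single log could cover both.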

> Secondary indices for column families
> -------------------------------------
>                 Key: CASSANDRA-749
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Gary Dusbabek
>            Assignee: Gary Dusbabek
>            Priority: Minor
>             Fix For: 0.8
>         Attachments: 0001-simple-secondary-indices.patch, views-discussion.txt

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
