cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1156) support querying multiple nodes for index scan
Date Thu, 29 Jul 2010 21:55:17 GMT


Jonathan Ellis commented on CASSANDRA-1156:

pushed fancy statistics-based parallelization to CASSANDRA-1337.  this merely adds support
for scanning across multiple nodes to satisfy a query serially, as well as ConsistencyLevel-awareness.

on reflection, it seems that forcing the user to wrap multiget/range scan/index scan in a
RowPredicate is the wrong move, even if the eventual proliferation of _count methods pains
me.  0001 breaks out the thrift updates needed for that change. (as usual, `ant gen-thrift-java`
is left as an exercise for the reader to avoid unnecessary noise in the patchset.)

> support querying multiple nodes for index scan
> ----------------------------------------------
>                 Key: CASSANDRA-1156
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>            Reporter: Jonathan Ellis
>             Fix For: 0.7 beta 1
>         Attachments: 0001-update-thrift.txt, 0002-handle-index-scans-across-multiple-nodes-and-consisten.txt
> given CASSANDRA-1155, we should query multiple nodes for the rows corresponding to the
> given index criteria, such that we have a 90% chance of getting enough rows w/o having to
> do another query (but, if our estimate is incorrect, we do need to loop and do a 2nd query).
> we start with the first node in token order, so that we only have to query a single node
> for low cardinality (i.e., every index value has many rows associated with it).  we do this
> by ordering the keys in the index row, in partitioner order.
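
The serial scan described above can be sketched roughly as follows. This is a minimal illustration, not the code in the attached patches: `scan_index_serially`, `query_node`, and the in-memory `matches` table are hypothetical names, and the sketch leaves out ConsistencyLevel handling and the statistics-based estimate that was moved to CASSANDRA-1337. It only shows the loop: visit nodes in token order, ask each one for the rows still missing, and stop as soon as the requested count is reached.

```python
from typing import Callable, List, Sequence


def scan_index_serially(
    nodes_in_token_order: Sequence[str],
    query_node: Callable[[str, int], List[str]],
    max_rows: int,
) -> List[str]:
    """Collect up to max_rows matching row keys, querying one node at a time."""
    rows: List[str] = []
    for node in nodes_in_token_order:
        remaining = max_rows - len(rows)
        if remaining <= 0:
            break  # earlier nodes already satisfied the query
        # Ask this node only for what is still missing.
        rows.extend(query_node(node, remaining)[:remaining])
    return rows


# Hypothetical in-memory stand-in for per-node index lookups.
matches = {"node-a": ["k1", "k2"], "node-b": ["k3"], "node-c": ["k4", "k5"]}
contacted: List[str] = []


def fake_query(node: str, limit: int) -> List[str]:
    contacted.append(node)
    return matches[node][:limit]


result = scan_index_serially(["node-a", "node-b", "node-c"], fake_query, 3)
```

In this toy run the third node is never contacted, because the first two already supply the three requested rows. With a low-cardinality index, where the first node alone holds enough rows, the loop stops after a single query.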

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
