cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Reopened] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes
Date Mon, 30 Jul 2012 07:54:36 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne reopened CASSANDRA-1337:
-----------------------------------------


I'm sorry, but I still don't see how that correctly handles secondary indexes (in the non-CQL3
case; it's fine to defer CQL3 to a follow-up ticket if you prefer).

Also, the parallelHandlers integer in the "fix" doesn't really work. Suppose you have a concurrencyFactor
of 2 but more than 2 ranges. If the first range happens to be a local range, you'll decrement parallelHandlers,
and so the following queries will be done serially, while they should be done 2 by 2 in parallel.
Besides, why is the local case a special case in the first place? If we're
parallelizing queries, there is no reason not to parallelize the local path too.
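
To make the "2 by 2" behaviour concrete, here is a minimal sketch of batching ranges by the concurrency factor with no special case for a local range. It is not the code from the attached patches: the class name `BatchedRangeScan`, the methods `batches` and `scan`, and the `query` callback are hypothetical names chosen for illustration, and ranges are modelled as plain list elements.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; not taken from the Cassandra code base.
final class BatchedRangeScan {

    /** Splits ranges into consecutive batches of at most concurrencyFactor entries. */
    static <T> List<List<T>> batches(List<T> ranges, int concurrencyFactor) {
        if (concurrencyFactor < 1) {
            throw new IllegalArgumentException("concurrencyFactor must be >= 1");
        }
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < ranges.size(); i += concurrencyFactor) {
            int end = Math.min(i + concurrencyFactor, ranges.size());
            result.add(new ArrayList<>(ranges.subList(i, end)));
        }
        return result;
    }

    /**
     * Queries every range, one batch at a time. All ranges in a batch run in
     * parallel, whether they are local or remote; results keep range order.
     */
    static <T, R> List<R> scan(List<T> ranges, int concurrencyFactor, Function<T, R> query) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrencyFactor);
        try {
            List<R> results = new ArrayList<>();
            for (List<T> batch : batches(ranges, concurrencyFactor)) {
                List<Callable<R>> tasks = new ArrayList<>();
                for (T range : batch) {
                    tasks.add(() -> query.apply(range));
                }
                // invokeAll blocks until every task in this batch has completed.
                for (Future<R> future : pool.invokeAll(tasks)) {
                    results.add(future.get());
                }
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

With five ranges and a concurrency factor of 2, `batches` yields groups of sizes 2, 2, and 1, regardless of whether the first range is local.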

                
> parallelize fetching rows for low-cardinality indexes
> -----------------------------------------------------
>
>                 Key: CASSANDRA-1337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: David Alves
>             Fix For: 1.2
>
>         Attachments: 0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt,
1137-bugfix.patch, CASSANDRA-1337.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> currently, we read the indexed rows from the first node (in partitioner order); if that
does not have enough matching rows, we read the rows from the next, and so forth.
> we should use the statistics from CASSANDRA-1155 to query multiple nodes in parallel,
such that we have a high chance of getting enough rows w/o having to do another round of queries
(but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough
data or we have fetched from each node).
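
The approach described in the issue above can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation from the ticket: the class `IndexScanRounds`, the methods `concurrencyFactor` and `fetch`, the per-node row estimate parameter, and the `queryRound` callback (which stands in for querying a group of nodes in parallel) are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not taken from the Cassandra code base.
final class IndexScanRounds {

    /**
     * Number of nodes to query per round so that a single round is likely to
     * return enough rows, clamped to the range [1, nodeCount].
     */
    static int concurrencyFactor(int rowsWanted, double estimatedRowsPerNode, int nodeCount) {
        if (estimatedRowsPerNode <= 0) {
            // No usable estimate: assume every node may be needed.
            return Math.max(1, nodeCount);
        }
        int needed = (int) Math.ceil(rowsWanted / estimatedRowsPerNode);
        return Math.max(1, Math.min(needed, nodeCount));
    }

    /**
     * Queries nodes in rounds of concurrencyFactor nodes each, stopping once
     * enough rows were collected or every node has been queried.
     */
    static <N, R> List<R> fetch(List<N> nodes, int rowsWanted, double estimatedRowsPerNode,
                                Function<List<N>, List<R>> queryRound) {
        List<R> rows = new ArrayList<>();
        int factor = concurrencyFactor(rowsWanted, estimatedRowsPerNode, nodes.size());
        for (int i = 0; i < nodes.size() && rows.size() < rowsWanted; i += factor) {
            // If the estimate was too optimistic, the loop runs another round.
            List<N> round = nodes.subList(i, Math.min(i + factor, nodes.size()));
            rows.addAll(queryRound.apply(round));
        }
        // Trim any surplus rows returned by the last round.
        return rows.size() > rowsWanted ? new ArrayList<>(rows.subList(0, rowsWanted)) : rows;
    }
}
```

For example, asking for 10 rows with an estimate of 4 matching rows per node on a 5-node ring gives a factor of 3, so the first round queries three nodes at once instead of one.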


        
