cassandra-commits mailing list archives

From "Jeremiah Jordan (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-1337) parallelize fetching rows for low-cardinality indexes and range scans of small tables
Date Wed, 13 Nov 2013 22:23:21 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremiah Jordan updated CASSANDRA-1337:
---------------------------------------

    Description: 
currently, we read the indexed rows (and rows for a range scan) from the first node (in partitioner
order); if that does not have enough matching rows, we read the rows from the next, and so
forth.

we should use the statistics from CASSANDRA-1155 to query multiple nodes in parallel, such
that we have a high chance of getting enough rows w/o having to do another round of queries
(but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough
data or we have fetched from each node).


  was:
currently, we read the indexed rows from the first node (in partitioner order); if that does
not have enough matching rows, we read the rows from the next, and so forth.

we should use the statistics from CASSANDRA-1155 to query multiple nodes in parallel, such
that we have a high chance of getting enough rows w/o having to do another round of queries
(but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough
data or we have fetched from each node).



> parallelize fetching rows for low-cardinality indexes and range scans of small tables
> -------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-1337
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1337
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Tyler Hobbs
>            Priority: Minor
>             Fix For: 2.1
>
>         Attachments: 1137-bugfix.patch, 1337-v4.patch, 1337-v5.patch, 1337.patch, ASF.LICENSE.NOT.GRANTED--0001-CASSANDRA-1337-scan-concurrently-depending-on-num-rows.txt, CASSANDRA-1337.patch
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> currently, we read the indexed rows (and rows for a range scan) from the first node (in
> partitioner order); if that does not have enough matching rows, we read the rows from the
> next, and so forth.
>
> we should use the statistics from CASSANDRA-1155 to query multiple nodes in parallel,
> such that we have a high chance of getting enough rows w/o having to do another round of queries
> (but, if our estimate is incorrect, we do need to loop and do more rounds until we have enough
> data or we have fetched from each node).
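
The round-based approach described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the attached patches: `concurrency_factor`, `parallel_scan`, the `fetch` callback, and the `est_rows_per_range` parameter are all hypothetical names, and the per-range row estimate is assumed to come from statistics such as those in CASSANDRA-1155.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def concurrency_factor(rows_needed, est_rows_per_range, ranges_left):
    """Guess how many ranges to query in one round to satisfy rows_needed."""
    if est_rows_per_range <= 0:
        # No usable estimate (assumption): fall back to querying everything left.
        return ranges_left
    wanted = math.ceil(rows_needed / est_rows_per_range)
    return max(1, min(ranges_left, wanted))


def parallel_scan(ranges, fetch, limit, est_rows_per_range):
    """Collect up to `limit` rows, querying several ranges per round.

    ranges: range/node identifiers in partitioner order (hypothetical).
    fetch:  callable(range_id, max_rows) -> list of rows (hypothetical).
    Returns (rows, number_of_rounds).
    """
    rows = []
    next_idx = 0
    rounds = 0
    with ThreadPoolExecutor() as pool:
        # Loop until we have enough rows or every range has been queried.
        while len(rows) < limit and next_idx < len(ranges):
            needed = limit - len(rows)
            n = concurrency_factor(
                needed, est_rows_per_range, len(ranges) - next_idx
            )
            batch = ranges[next_idx:next_idx + n]
            next_idx += n
            rounds += 1
            # pool.map yields results in input order, so the combined
            # rows stay in partitioner order.
            for result in pool.map(lambda r: fetch(r, needed), batch):
                rows.extend(result)
    return rows[:limit], rounds
```

With an accurate estimate the scan finishes in a single round; if the estimate is too optimistic, the loop issues further rounds, which matches the fallback behaviour the description calls for. A real implementation would presumably also refine the estimate between rounds, which this sketch does not attempt.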



--
This message was sent by Atlassian JIRA
(v6.1#6144)
