cassandra-user mailing list archives

From Tyler Hobbs <>
Subject Re: Slow down of secondary index query with VNODE (C* version 1.2.18, jre6).
Date Fri, 19 Sep 2014 22:28:43 GMT
On Fri, Sep 19, 2014 at 4:53 PM, Jay Patel <> wrote:

> When the coordinator fires an indexed scan request to a node, why
> doesn't it ask that node to check all of its (at least primary) ranges for
> the queried data at once? Also, internally that node should be able to
> just do one scan through all of the ranges it holds, shouldn't it?
> (e.g. [min(-9223372036854775808), max(-9193352069377957523)] and
> (max(-9136021049555745100), max(-8959555493872108621)], etc.)
> Seems like it needs to query data in token order. So,
> min(-9223372036854775808), max(-*9193352069377957523*) on
> But next range ((max(-*9193352069377957523*), max(-*9136021049555745100*)])
> is on so fire query there. Then, next range  (max(-
> *9136021049555745100*), max(-8959555493872108621)] again on
> Btw, I'm not too sure regarding min/max or max/max in trace
> output.

The coordinator certainly could batch multiple range requests that are
going to the same replica.  It's an optimization that would primarily help
the empty table/high cardinality case, but you're welcome to open a
ticket.  3.0 is the earliest this would make it in.
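
To make the batching idea concrete, here is a rough sketch of grouping token ranges by the replica that owns them, so each replica would receive one request instead of one per range. This is not Cassandra code: the tokens are taken from the trace quoted in this thread, but the replica names (node_a, node_b), their assignment to ranges, and the batch_by_replica helper are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical (start_token, end_token, replica) triples. The tokens come from
# the trace quoted in this thread; the replica names and which range each one
# owns are invented placeholders.
RANGES = [
    (-9223372036854775808, -9193352069377957523, "node_a"),
    (-9193352069377957523, -9136021049555745100, "node_b"),
    (-9136021049555745100, -8959555493872108621, "node_a"),
    (-8959555493872108621, -8929774302283364912, "node_b"),
]


def batch_by_replica(ranges):
    """Group token ranges by owning replica (illustrative helper, not a Cassandra API)."""
    batches = defaultdict(list)
    for start, end, replica in ranges:
        batches[replica].append((start, end))
    return dict(batches)


batches = batch_by_replica(RANGES)
for replica, owned in batches.items():
    print(replica, "->", len(owned), "ranges in one request")
```

With these made-up assignments, four per-range requests collapse into two per-replica requests. The merged results would still have to be returned to the client in token order.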

> I found below comment in
> "The problem is that we have to scan the nodes in token order so we dont
> break the existing API's, if we do so then we are sending a lot more
> requests and waiting for the response than the number of nodes. "
> Don't understand the restriction though - "don't break the existing API's".

I think he's just saying that we have to make sure we return results in
token order (and if there's a limit on the query, return the first N
results when listed in token order).
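
As an illustration of that constraint, the sketch below walks ranges in ascending token order and stops as soon as the limit is satisfied. It is a toy model under stated assumptions, not the actual coordinator logic: scan_in_token_order, the fetch callable, and the small integer tokens in DATA are all hypothetical.

```python
def scan_in_token_order(ranges, fetch, limit):
    """Query ranges in ascending token order, stopping once `limit` rows are collected.

    `ranges` is a list of (start_token, end_token) pairs and `fetch` is a
    hypothetical callable returning the matching rows for one range, already
    sorted by token. Neither is a real Cassandra API.
    """
    results = []
    requests = 0
    for start, end in sorted(ranges):
        requests += 1
        results.extend(fetch(start, end))
        if len(results) >= limit:
            break  # later ranges cannot contribute to the first `limit` rows
    return results[:limit], requests


# Toy data: matching row tokens per range (invented values).
DATA = {(0, 10): [3, 7], (10, 20): [], (20, 30): [22], (30, 40): [31, 35]}
rows, requests = scan_in_token_order(list(DATA), lambda s, e: DATA[(s, e)], limit=3)
print(rows, requests)
```

In this toy run the scan issues three requests and never touches the last range. With vnodes, consecutive ranges tend to belong to different nodes, so a walk like this can return to the same node several times.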

> With non-vnode, it queries a particular node only one time. Btw, in
> the worst case, I understand a secondary index query sometimes has to scan
> all the nodes in the cluster (empty table or high cardinality index?) but I
> don't understand why vnode makes it scan the *same node* multiple
> times. I see this behavior even when RF is 1.
> >> Snippet from output1.txt attached earlier:
> Executing indexed scan for [min(-9223372036854775808),
> max(-9193352069377957523)] | 23:11:30,992 | |
> Executing indexed scan for (max(-9193352069377957523),
> max(-9136021049555745100)] | 23:11:30,998 | |
> Executing indexed scan for (max(-9136021049555745100),
> max(-8959555493872108621)] | 23:11:30,999 | |
> Executing indexed scan for (max(-8959555493872108621),
> max(-8929774302283364912)] | 23:11:31,000 | |

I'm not sure how your question here is different from the one above.

Tyler Hobbs
DataStax <>
