cassandra-commits mailing list archives

From "Benedict (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-6976) Determining replicas to query is very slow with large numbers of nodes or vnodes
Date Mon, 01 Dec 2014 19:21:13 GMT


Benedict commented on CASSANDRA-6976:

bq.  I don't see a reason to drop it just because the ticket got caught up in implementation
details and not the user facing issue we want to address.

Well, given that the test case that originally produced this concern almost certainly used the
same methodology you did, I suspect you did indeed track the problem down to a non-warm JVM.

bq. The entire thing runs in 60 milliseconds with 2000 tokens. That is 2x the time to warm
up the cache (assuming a correct number for warmup). 

You're assuming that (1) the cache stays warm in normal operation, (2) the warmup figures
you have are for similar data distributions, (3) the warmup is simply a matter of presence
in cache rather than likelihood of eviction, and (4) all this behaviour has no negative
impact outside of the method itself. But, like I said, I agree it is unlikely to make an order
of magnitude difference by itself, especially not with the current state of C*.
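
To make the cold-versus-warm distinction concrete, here is a minimal sketch that times the
same batch of ring lookups once on a fresh JVM and again after repeated runs. It is not the
Cassandra code path: TokenLookupSketch, firstTokenIndex and timeLookups are hypothetical names,
the lookup is a plain binary search over 2000 random long tokens, and the iteration counts are
arbitrary. A harness such as JMH would be the more rigorous way to measure this.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class TokenLookupSketch {

    // Index of the first token >= key on a sorted ring, wrapping to 0 past the last token.
    static int firstTokenIndex(long[] sortedTokens, long key) {
        int i = Arrays.binarySearch(sortedTokens, key);
        if (i < 0) {
            i = -i - 1; // convert the negative result into the insertion point
        }
        return i == sortedTokens.length ? 0 : i;
    }

    // Performs one lookup per key and returns the elapsed time in nanoseconds.
    static long timeLookups(long[] sortedTokens, long[] keys) {
        long start = System.nanoTime();
        int sink = 0;
        for (long key : keys) {
            sink += firstTokenIndex(sortedTokens, key);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Integer.MIN_VALUE) {
            System.out.println(sink); // keeps the loop result live so the JIT cannot drop it
        }
        return elapsed;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        long[] tokens = rnd.longs(2000).sorted().toArray(); // 2000 tokens, as in the quoted figure
        long[] keys = rnd.longs(100_000).toArray();

        long cold = timeLookups(tokens, keys);   // first run: interpreted code, cold caches
        for (int i = 0; i < 50; i++) {
            timeLookups(tokens, keys);           // give the JIT a chance to compile the hot loop
        }
        long warm = timeLookups(tokens, keys);   // same work after warmup

        System.out.printf("cold: %d us, warm: %d us%n", cold / 1000, warm / 1000);
    }
}
```

The gap between the two numbers says nothing about eviction under a real workload or about
effects outside the method, which is the point of assumptions (1) to (4) above.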

bq. Range queries are slow because they produce a lot of ranges.

Did we determine that if the _result_ is a narrow range the performance is significantly faster?
Because this stemmed from a situation where the entire contents were known to be node-local
(because the data was local only, it wasn't actually distributed). I wouldn't be at all surprised
if it was fine, given the likely cause you tracked down, but I don't think we actually demonstrated

bq. What queries could identify that this shortcut is possible?

I am referring here to the more general case of getLiveSortedEndpoints, which is used much
more widely. But, like I said, I raised this largely because of a general nagging feeling that
this whole area of code has many inefficiencies, not because they are likely to really matter.
The only actionable item is that we *should* take steps to ensure our default (and common) test
and benchmark configs more accurately represent real cluster configs, because right now we
simply do not exercise these codepaths from a performance perspective.

> Determining replicas to query is very slow with large numbers of nodes or vnodes
> --------------------------------------------------------------------------------
>                 Key: CASSANDRA-6976
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Benedict
>            Assignee: Ariel Weisberg
>              Labels: performance
>         Attachments:, jmh_output.txt, jmh_output_murmur3.txt,
> As described in CASSANDRA-6906, this can be ~100ms for a relatively small cluster with
vnodes, which is longer than it will spend in transit on the network. This should be much
