cassandra-commits mailing list archives

From "Peter Schuller (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-2540) Data reads by default
Date Fri, 22 Apr 2011 10:02:05 GMT


Peter Schuller commented on CASSANDRA-2540:

To me, "significantly improved latency" refers to the expected behavior when a single node is
slow, individual messages are dropped, and so on. This isn't primarily about avoiding another
network round-trip to shave a few milliseconds off latency (that is at most a small part of
it), but rather about getting a much more consistent latency over time by removing outliers.

Nodes doing GC, being temporarily saturated and dropping messages (e.g., having just come up
with cold caches), or being killed by an operator (crash-only) are examples of events that tend
to happen to individual nodes (rather than to several nodes in a coordinated fashion) and that
cause a large number of requests to suddenly have extremely poor latency (causing, e.g., spikes
in concurrency in the application using the cluster).

In that way, the aim isn't (again, to me; maybe I'm misinterpreting Stu) to optimize for digest
mismatches, but rather to optimize for the case where the node that happened to be picked for
the data read is slow or down.
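
To make the outlier argument concrete, here is a rough simulation sketch. It is not Cassandra
code: the latency model (2 ms normally, 500 ms with 1% probability), the function names, and
the N=3, R=2 setup are all assumptions chosen for illustration, and digest mismatches are
ignored entirely. In "digest mode" the coordinator must hear from the one replica picked for
the data read plus R - 1 digests; in "data mode" any R responses are enough.

```python
import random


def replica_latency_ms(rng, p_slow=0.01, fast_ms=2.0, slow_ms=500.0):
    """Hypothetical latency model: a replica is occasionally slow (GC, cold cache)."""
    return slow_ms if rng.random() < p_slow else fast_ms


def simulate(trials=100_000, n=3, r=2, seed=42):
    """Return per-request latencies for (digest mode, data mode)."""
    rng = random.Random(seed)
    digest_mode, data_mode = [], []
    for _ in range(trials):
        lat = [replica_latency_ms(rng) for _ in range(n)]
        # Digest mode: replica 0 was picked for the data read, so its response
        # is mandatory; r - 1 digests from the other replicas are also needed.
        others = sorted(lat[1:])
        digests_done = others[r - 2] if r > 1 else 0.0
        digest_mode.append(max(lat[0], digests_done))
        # Data mode: every replica returns data, so the r-th fastest decides.
        data_mode.append(sorted(lat)[r - 1])
    return digest_mode, data_mode


def percentile(values, q):
    """Nearest-rank style percentile, q in [0, 1]."""
    ordered = sorted(values)
    return ordered[min(len(ordered) - 1, int(q * len(ordered)))]


if __name__ == "__main__":
    digest_mode, data_mode = simulate()
    for name, xs in (("digest", digest_mode), ("data", data_mode)):
        print(name, "p99 =", percentile(xs, 0.99), "p99.9 =", percentile(xs, 0.999))
```

Under this model a request in data mode is never slower than the same request in digest mode,
because a single slow node can only hurt when it is the one holding the data read. How large
the difference is in practice depends on the real latency distribution, which this sketch does
not attempt to capture.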

But I totally agree about fat columns (and of course especially in the multi-DC case). So
there are definitely use cases for digest reads.
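
A back-of-the-envelope sketch of why fat columns change the picture. The 16-byte figure is the
size of an MD5 hash; that the digest is MD5, that message headers can be ignored, and the
helper names below are assumptions made for illustration only.

```python
MD5_DIGEST_BYTES = 16  # size of an MD5 hash; assumed digest algorithm


def response_bytes(value_bytes, replicas_contacted, digest_reads):
    """Payload bytes returned to the coordinator for one read (headers ignored)."""
    if digest_reads:
        # One full data response plus a digest from each remaining replica.
        return value_bytes + (replicas_contacted - 1) * MD5_DIGEST_BYTES
    # Data reads everywhere: every contacted replica returns the full value.
    return value_bytes * replicas_contacted


def bandwidth_ratio(value_bytes, replicas_contacted=2):
    """How many times more bytes data-reads-by-default moves than digest reads."""
    data = response_bytes(value_bytes, replicas_contacted, digest_reads=False)
    digest = response_bytes(value_bytes, replicas_contacted, digest_reads=True)
    return data / digest


if __name__ == "__main__":
    for size in (16, 160, 1_000_000):
        print(size, "bytes ->", round(bandwidth_ratio(size), 3), "x")
```

For values close to the digest size the ratio stays near 1, while for very large values it
approaches the number of replicas contacted, which is where digest reads would still pay off.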

> Data reads by default
> ---------------------
>                 Key: CASSANDRA-2540
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stu Hood
>             Fix For: 0.8.0
> The intention of digest vs data reads is to save bandwidth in the read path at the cost
> of latency, but I expect that this has been a premature optimization.
> * Data requested by a read will often be within an order of magnitude of the digest size,
>   and a failed digest means extra roundtrips, more bandwidth
> * The [digest reads but not your data read|]
>   problem means failing QUORUM reads because a single node is unavailable, and would require
>   eagerly re-requesting at some fraction of your timeout
> * Saving bandwidth in cross datacenter usecases comes at huge cost to latency, but since
>   both constraints change proportionally (enough), the tradeoff is not clear
> Some options:
> # Add an option to use digest reads
> # Remove digest reads entirely (and/or punt and make them a runtime optimization based
>   on data size in the future)
> # Continue to use digest reads, but send them to {{N - R}} nodes for (somewhat) more
>   predictable behavior with QUORUM
>
> The outcome of data-reads-by-default should be significantly improved latency, with a
> moderate increase in bandwidth usage for large reads.

This message is automatically generated by JIRA.
