cassandra-commits mailing list archives

From "Christian Spriegel (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-7886) Coordinator should not wait for read timeouts when replicas hit Exceptions
Date Mon, 15 Dec 2014 11:58:15 GMT


Christian Spriegel commented on CASSANDRA-7886:

Hi [~thobbs], sorry I kept you waiting for so long.

{quote}Instead of using Unavailable when the protocol version is less than 4, use ReadTimeout.
Unavailable signals that some of the replicas are considered to be down, which is not the
case here. Plus, ReadTimeout is the error that is currently returned in these circumstances.{quote}
Makes sense. I changed Unavailable to ReadTimeout for CQL3 and Thrift.
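
A minimal sketch of the version check discussed above, for illustration only. All names here (`ReadErrorSelector`, `ErrorCode`, `errorFor`) are hypothetical and are not taken from the attached patch; the only thing drawn from the discussion is that clients on a protocol version below 4 get a read timeout rather than the new read failure.

```java
public final class ReadErrorSelector {
    /** Hypothetical stand-ins for the two error kinds discussed in this ticket. */
    public enum ErrorCode { READ_TIMEOUT, READ_FAILURE }

    /** Assumption from the discussion: read failures are only sent from protocol v4 on. */
    public static final int FIRST_VERSION_WITH_READ_FAILURE = 4;

    /** Picks the error to report to a client speaking the given protocol version. */
    public static ErrorCode errorFor(int protocolVersion) {
        // Older clients do not know the new error, so they keep seeing
        // the timeout they would have received before the change.
        return protocolVersion < FIRST_VERSION_WITH_READ_FAILURE
                ? ErrorCode.READ_TIMEOUT
                : ErrorCode.READ_FAILURE;
    }

    public static void main(String[] args) {
        System.out.println("v3 client gets " + errorFor(3));
        System.out.println("v4 client gets " + errorFor(4));
    }
}
```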

{quote}In ErrorMessage.encodedSize(), there's some commented out code for READ_FAILURE handling.{quote}
The commented-out code was meant as preparation for WriteFailureExceptions. Would it perhaps
make sense to fully add WriteFailureException? In a follow-up ticket, we could then implement it
for the different writes. Or do you want me to get rid of it?

{quote}Instead of catching and ignoring TombstoneOverwhelmingException in multiple places,
I suggest you move the logged error message into the TOE message and let it propagate (and
be logged) like any other exception.{quote}
Just to make sure that we don't touch anything new here: TOEs are already logged inside
SliceQueryFilter.collectReducedColumns. I simply took this catch block from the
ReadVerbHandler/RangeSliceVerbHandler and put it into StorageProxy/MessageDeliveryTask.
I don't like that either, but I did not want to touch it. Do you still want me to change it?

{quote}Can you update docs/native_protocol_v4.spec with these changes? You can look at the
previous specs to see examples of the "changes from the previous version" section{quote}
Ok. Should we also add WriteFailures?

{quote}In StorageProxy, the unavailables counter should not be incremented for read failures.
I suggest creating a new, separate failure counter.{quote}

{quote}Also in StorageProxy, there's now quite a bit of code duplication around building error
messages for ReadTimeoutExceptions and ReadFailureExceptions. Can you condense those somewhat?{quote}
I merged ReadTimeoutException|ReadFailureException into a single catch block.
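
For readers unfamiliar with the construct, a merged catch block of this kind can be sketched with Java's multi-catch. This is an illustrative sketch, not the code from the patch: the exception classes below are simplified stand-ins with the same names, and `RequestException`, `ReadOperation`, `execute`, and the message format are hypothetical.

```java
public final class MergedCatchSketch {
    /** Hypothetical common base carrying the fields both messages need. */
    public static class RequestException extends Exception {
        public final int received;
        public final int blockFor;

        public RequestException(String message, int received, int blockFor) {
            super(message);
            this.received = received;
            this.blockFor = blockFor;
        }
    }

    /** Simplified stand-in, not the real Cassandra class. */
    public static class ReadTimeoutException extends RequestException {
        public ReadTimeoutException(int received, int blockFor) {
            super("read timeout", received, blockFor);
        }
    }

    /** Simplified stand-in, not the real Cassandra class. */
    public static class ReadFailureException extends RequestException {
        public ReadFailureException(int received, int blockFor) {
            super("read failure", received, blockFor);
        }
    }

    /** Hypothetical read operation that may fail in either way. */
    public interface ReadOperation {
        void run() throws ReadTimeoutException, ReadFailureException;
    }

    public static String execute(ReadOperation read) {
        try {
            read.run();
            return "ok";
        } catch (ReadTimeoutException | ReadFailureException e) {
            // In a multi-catch, e has the common supertype (RequestException),
            // so the message is built once for both error kinds.
            return String.format("%s: %d/%d responses", e.getMessage(), e.received, e.blockFor);
        }
    }

    public static void main(String[] args) {
        System.out.println(execute(() -> { throw new ReadTimeoutException(1, 2); }));
        System.out.println(execute(() -> { throw new ReadFailureException(0, 2); }));
    }
}
```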

I also added the last cell name to the TOE, so that an administrator can get an estimate of where
to look for the tombstones. This doesn't really match the ticket's new name, but it is related
to my original issue :-)

Overall, one question remains from my side: Should I also prepare WriteFailureExceptions?
I could (as a follow-up ticket) add these to the write-codepath.

> Coordinator should not wait for read timeouts when replicas hit Exceptions
> --------------------------------------------------------------------------
>                 Key: CASSANDRA-7886
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>         Environment: Tested with Cassandra 2.0.8
>            Reporter: Christian Spriegel
>            Assignee: Christian Spriegel
>            Priority: Minor
>              Labels: protocolv4
>             Fix For: 3.0
>         Attachments: 7886_v1.txt, 7886_v2_trunk.txt, 7886_v3_trunk.txt
> *Issue*
> When TombstoneOverwhelmingExceptions occur in queries, the
query is simply dropped on every data node, and no response is sent back to the coordinator.
Instead, the coordinator waits for the specified read_request_timeout_in_ms.
> On the application side this can cause memory issues, since the application is waiting
for the timeout interval for every request. Therefore, if our application runs into TombstoneOverwhelmingExceptions,
then (sooner or later) our entire application cluster goes down :-(
> *Proposed solution*
> I think the data nodes should send an error message to the coordinator when they run into
a TombstoneOverwhelmingException. Then the coordinator does not have to wait for the timeout interval.
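
The proposed behaviour can be illustrated with a small sketch of a coordinator-side callback that wakes up as soon as a replica reports an error, instead of sleeping until the timeout. This is an assumption-laden illustration, not Cassandra's implementation: `FailFastReadCallback`, `Outcome`, `onResponse`, `onFailure`, and `await` are hypothetical names, and the sketch simplifies by treating any single replica failure as enough to stop waiting.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public final class FailFastReadCallback {
    public enum Outcome { SUCCESS, FAILURE, TIMEOUT }

    private final int blockFor; // number of replica responses required
    private final AtomicInteger received = new AtomicInteger();
    private final CountDownLatch done = new CountDownLatch(1);

    public FailFastReadCallback(int blockFor) {
        this.blockFor = blockFor;
    }

    /** Called when a replica returns data. */
    public void onResponse() {
        if (received.incrementAndGet() >= blockFor)
            done.countDown();
    }

    /** Called when a replica reports an error (for example, too many tombstones). */
    public void onFailure() {
        // Simplification: one failure is enough to release the waiting coordinator.
        done.countDown();
    }

    /** Blocks until enough responses arrive, a failure is reported, or the timeout elapses. */
    public Outcome await(long timeoutMillis) {
        try {
            if (!done.await(timeoutMillis, TimeUnit.MILLISECONDS))
                return Outcome.TIMEOUT;
        } catch (InterruptedException e) {
            // Treated like a timeout in this sketch; the interrupt flag is restored.
            Thread.currentThread().interrupt();
            return Outcome.TIMEOUT;
        }
        return received.get() >= blockFor ? Outcome.SUCCESS : Outcome.FAILURE;
    }

    public static void main(String[] args) {
        FailFastReadCallback callback = new FailFastReadCallback(2);
        callback.onResponse();
        callback.onFailure();
        // Returns immediately even though the timeout is ten seconds.
        System.out.println(callback.await(10_000));
    }
}
```

A fuller version would probably only give up once the remaining replicas can no longer satisfy the consistency level; that refinement is left out here to keep the sketch short.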

This message was sent by Atlassian JIRA
