cassandra-commits mailing list archives

From "Tyler Hobbs (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-12311) Propagate TombstoneOverwhelmingException to the client
Date Tue, 02 Aug 2016 23:35:20 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15405018#comment-15405018 ]

Tyler Hobbs edited comment on CASSANDRA-12311 at 8/2/16 11:34 PM:
------------------------------------------------------------------

I think the idea of generalizing this to support error codes is good.  However, I think we
should sort a few things out ahead of time.

Ideally, we will have many more error codes than just the one for tombstone overwhelming.
 If these are described as part of the native protocol spec, then new error codes should only
really be introduced in new native protocol versions.  However, that seems like it might be
overly restrictive.  I wonder if perhaps the native protocol spec should just say "this is
a one-byte error code; for the meaning of the error code, look at <link to cassandra docs
page>".  Of course, this means that drivers probably would _not_ handle error codes in
a fancy way (such as tying error messages to particular codes), which is a downside.  One
upside to error codes is that they are easily googleable, though, so users could presumably
figure out the meaning quickly.

Second, we may want to combine this improvement with another one that I've been thinking of.
 Instead of having a single one-byte error code, we should return a map of endpoints to failure
codes.  Besides handling multiple types of failures correctly, this would let users know which
replica nodes actually had problems, which is something the current errors don't do.
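
To make the map idea concrete, here is a rough sketch (in Python for brevity, not the actual
patch) of how a coordinator or driver could turn a map of endpoints to failure codes into a
client-facing message.  The FailureReason names, their numeric values, and the
summarize_failures helper are all hypothetical and only for illustration.

```python
from collections import Counter
from enum import IntEnum


class FailureReason(IntEnum):
    # Hypothetical codes; the real values would be defined by the spec or docs.
    UNKNOWN = 0
    READ_TOO_MANY_TOMBSTONES = 1


def summarize_failures(failures_by_endpoint):
    """Build a readable summary from a {endpoint: failure code} map."""
    counts = Counter(failures_by_endpoint.values())
    parts = []
    for code, n in sorted(counts.items()):
        try:
            name = FailureReason(code).name
        except ValueError:
            # A code this client does not know yet: fall back to the number,
            # which the user can still look up.
            name = f"code {code}"
        endpoints = sorted(ep for ep, c in failures_by_endpoint.items() if c == code)
        parts.append(f"{name} on {n} replica(s): {', '.join(endpoints)}")
    return "; ".join(parts)
```

With this shape, two replicas failing for different reasons are both reported, and an
unrecognized code degrades to its numeric value instead of being dropped.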

Third, I think we should go with a two-byte error code.  The code is rarely sent, so the extra
space doesn't matter, and a single byte may become restrictive over time.
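
As a small illustration of the two-byte option (a sketch only, assuming the code would be
written as an unsigned big-endian 16-bit value; the function names are made up):

```python
import struct


def encode_failure_code(code: int) -> bytes:
    """Pack a failure code as an unsigned big-endian 16-bit integer."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError(f"failure code out of two-byte range: {code}")
    return struct.pack("!H", code)


def decode_failure_code(data: bytes) -> int:
    """Read a two-byte failure code from the start of a buffer."""
    (code,) = struct.unpack("!H", data[:2])
    return code
```

Two bytes allow 65536 distinct codes versus 256 for one byte, at a cost of one extra byte per
reported failure.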

-Last, I haven't had time to verify this, but it seems like the messaging service changes
may have to wait until 4.0?  I'm not sure if new parameters are handled gracefully by nodes
that don't know them yet.- *EDIT* yeah, it's already marked for 4.x.


> Propagate TombstoneOverwhelmingException to the client
> ------------------------------------------------------
>
>                 Key: CASSANDRA-12311
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12311
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Geoffrey Yu
>            Assignee: Geoffrey Yu
>            Priority: Minor
>             Fix For: 4.x
>
>         Attachments: 12311-trunk-v2.txt, 12311-trunk.txt
>
>
> Right now if a data node fails to perform a read because it ran into a {{TombstoneOverwhelmingException}},
> it only responds back to the coordinator node with a generic failure. Under this scheme, the
> coordinator won't be able to know exactly why the request failed and subsequently the client
> only gets a generic {{ReadFailureException}}. It would be useful to inform the client that
> their read failed because we read too many tombstones. We should have the data nodes reply
> with a failure type so the coordinator can pass this information to the client.



