cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jason Brown (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-14575) Reevaluate when to drop an internode connection on message error
Date Fri, 20 Jul 2018 12:48:00 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-14575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16550730#comment-16550730
] 

Jason Brown edited comment on CASSANDRA-14575 at 7/20/18 12:47 PM:
-------------------------------------------------------------------

[~iamaleksey] mentioned this in CASSANDRA-14574
{quote}As for when it's safe to ignore a failed to deser message - at least in the case of
unknown table id it is, and that's a common enough scenario. Think someone creates a table
and starts writes before waiting for schema to propagate. Or batchlog replays a mutation to
a node on which a table is either not yet known, or has been dropped since. Or, occasionally,
when we add new tables and use them during mixed mode/upgrade period.
{quote}
Yup, at a minimum I think we can handle this one here, as we already know the payload length,
and it's easy to skip beyond this message. Possibly all payload deser failures are ok are
to ignore, as long as we can skip the payload bytes.

If, however, we fail in parsing any other part of the message (magic, params, and so on),
we're pretty much screwed and will need to shut down the connection.

ftr, in CASSANDRA-14447, if we receive a verb id we do not recognize, we'll still deserialize
the message ([payload is skipped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageIn.java#L150],
as we have no serializer for the verb id we don't know), and the message [will be dropped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageDeliveryTask.java#L66]
in {{MessageDeliveryTask.process()}}. Thus, we don't need to worry about unknown verbs for
this ticket.

Reagrding params out of the message, with CASSANDRA-7544, we introduced an enum for all parameter
names, {{ParameterType}}. In {{MessageInHandler.readParams()}}, we [look up the {{ParameterType}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/async/MessageInHandler.java#L160]
based on the string we read from the stream. If the string is unknown (it came from a future
version), the lookup returns null. -Several lines later we'll try to insert that null key
into the {{EnumMap}}, and we'll get an exception and end up closing the connection. It's simple
enough to still deserialize the key and value, and just not insert into the map if the key
is null (effectively ignoring the param). This should be handled by this ticket, as well.-

-CORRECTION: we need to know the {{parameterType}} to deserialize the value. Thus we have
to fail the connection as we won't know the length of the value to read from the buf-.

Ughh, I need more coffee. CORRECTION 2: We *do* know the length of the value to read, we just
won't know how to deserialize it. Thus, we can just skip the deser and inserting into the
map step (skipping the value's bytes), and retain the connection as we'll still be in a deterministic
state of the byte stream.


was (Author: jasobrown):
[~iamaleksey] mentioned this in CASSANDRA-14574
{quote}As for when it's safe to ignore a failed to deser message - at least in the case of
unknown table id it is, and that's a common enough scenario. Think someone creates a table
and starts writes before waiting for schema to propagate. Or batchlog replays a mutation to
a node on which a table is either not yet known, or has been dropped since. Or, occasionally,
when we add new tables and use them during mixed mode/upgrade period.
{quote}
Yup, at a minimum I think we can handle this one here, as we already know the payload length,
and it's easy to skip beyond this message. Possibly all payload deser failures are ok are
to ignore, as long as we can skip the payload bytes.

If, however, we fail in parsing any other part of the message (magic, params, and so on),
we're pretty much screwed and will need to shut down the connection.

ftr, in CASSANDRA-14447, if we receive a verb id we do not recognize, we'll still deserialize
the message ([payload is skipped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageIn.java#L150],
as we have no serializer for the verb id we don't know), and the message [will be dropped|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/MessageDeliveryTask.java#L66]
in {{MessageDeliveryTask.process()}}. Thus, we don't need to worry about unknown verbs for
this ticket.

Reagrding params out of the message, with CASSANDRA-7544, we introduced an enum for all parameter
names, {{ParameterType}}. In {{MessageInHandler.readParams()}}, we [look up the {{ParameterType}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/net/async/MessageInHandler.java#L160]
based on the string we read from the stream. If the string is unknown (it came from a future
version), the lookup returns null. -Several lines later we'll try to insert that null key
into the {{EnumMap}}, and we'll get an exception and end up closing the connection. It's simple
enough to still deserialize the key and value, and just not insert into the map if the key
is null (effectively ignoring the param). This should be handled by this ticket, as well.-


 -CORRECTION: we need to know the {{parameterType}} to deserialize the value. Thus we have
to fail the connection as we won't know the length of the value to read from the buf-.

Ughh, I need more coffee. CORRECTION 2: We *do* know the length of the value to read, we just
won't know how to deserialize it. Thus, we can just the deser and inserting into the map step,
and retain the connection as we'll still be in a deterministic state of the byte stream.

> Reevaluate when to drop an internode connection on message error
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-14575
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14575
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jason Brown
>            Assignee: Jason Brown
>            Priority: Minor
>             Fix For: 4.0
>
>
> As mentioned in CASSANDRA-14574, explore if and when we can safely ignore an incoming
internode message on certain classes of failure.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@cassandra.apache.org
For additional commands, e-mail: commits-help@cassandra.apache.org


Mime
View raw message