cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: GMFD messages
Date Thu, 27 May 2010 20:19:39 GMT
Yes, Gossip goes through MessageDeserializer (MD) too.
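
For the size-limit hypothesis quoted below, here is a minimal Python sketch of how a hard cap on message size could silently drop part of the ring. It is not Cassandra's code: `MAX_PACKET_BYTES`, `serialize_with_cap`, `make_ring`, the addresses and the entry format are all hypothetical and chosen only for illustration.

```python
MAX_PACKET_BYTES = 1024  # hypothetical cap, not Cassandra's actual value


def serialize_with_cap(states, max_bytes=MAX_PACKET_BYTES):
    """Pack endpoint states into one message, stopping once the cap is reached.

    Returns (payload, included_endpoints). Entries that do not fit are
    silently dropped, which is the behaviour the hypothesis is about.
    """
    payload = bytearray()
    included = []
    for endpoint, state in states.items():
        entry = f"{endpoint}={state};".encode("utf-8")
        if len(payload) + len(entry) > max_bytes:
            break  # analogous to "Breaking out to respect the MTU size"
        payload += entry
        included.append(endpoint)
    return bytes(payload), included


def make_ring(node_count, state_len):
    """Build a hypothetical ring where every node carries state_len bytes of state."""
    return {f"10.0.0.{i}": "x" * state_len for i in range(1, node_count + 1)}


if __name__ == "__main__":
    for state_len in (10, 40):
        _, included = serialize_with_cap(make_ring(27, state_len))
        print(f"state_len={state_len}: {len(included)} of 27 endpoints serialized")
```

With small per-node state all 27 endpoints fit under the cap; once the state grows, the loop breaks early and the receiver would only learn about a subset of the ring, which would match the partial ring view described in the quoted message if the real code behaves similarly.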

On Thu, May 27, 2010 at 11:03 AM, Anthony Molinaro
<anthonym@alumni.caltech.edu> wrote:
>
> On Thu, May 27, 2010 at 08:04:18AM -0600, Jonathan Ellis wrote:
>> This is a relic of when Gossip was over UDP and had to worry about
>> packet size.  I created
>> https://issues.apache.org/jira/browse/CASSANDRA-1138 to remove those
>> notifications.
>
> Ahh, okay, well it's odd that a limit was set even with UDP.  I send large
> UDP packets all the time with LWES and don't have many issues, but glad
> to hear it will be fixed (I may locally patch in a larger packet size as
> a short-term workaround).  Looking at the code, it seems that if you hit
> either of these notifications the message is not serialized (i.e. the
> serialize calls return false).  Would this explain why, if I restart a
> machine in the cluster in this state, it only sees some of the ring?
>
> In other words maybe with a fresh restart of everything, there is some
> part of the serialized message which is small enough that all 27 machines
> can be in there, however, once they've been running for a little bit they
> start to creep over the limit, then suddenly gossiping starts to fail
> as responses from some nodes are never sent, and I start seeing inconsistency
> in the rings?
>
> I think this hypothesis could be tested by just increasing the MAX size
> so I think I will try that.
>
>> I think the correlation with MessageDeserializer is a red herring.
>> Gossip only happens once per second so I don't see how that could back
>> MD up.
>
> Yeah, I couldn't see how either, just the 'Stopping deserialization' message
> made me think it might (as only the nodes with a backed up MessageDeserializer
> had that message).  Do gossip messages flow through the MessageDeserializer?
>
> Thanks for the response,
>
> -Anthony
>
>> On Tue, May 25, 2010 at 5:33 PM, Anthony Molinaro
>> <anthonym@alumni.caltech.edu> wrote:
>> > Hi,
>> >
>> >  I just noticed I have lots of these messages
>> >
>> > INFO [GMFD:1] 2010-05-25 23:21:04,070 GossipDigestSynMessage.java (line 152)
>> >  Remaining bytes zero. Stopping deserialization in EndPointState.
>> > INFO [GMFD:1] 2010-05-25 23:21:05,224 GossipDigestSynMessage.java (line 129)
>> >  @@@@ Breaking out to respect the MTU size in EPS. Estimate is 56 @@@@
>> >
>> > The first message only occurs on some machines in my cluster.  The second
>> > on all of them.
>> >
>> > The ones with the first message seem to be building up quite a backlog
>> > in their MessageDeserializer PendingTasks.
>> >
>> > I assume there is a correlation.  What could be causing this sort of thing?
>> >
>> > This cluster is now at 27 m1.xlarge boxes on ec2 running 0.6.2 of some flavor.
>> >
>> > Thanks,
>> >
>> > -Anthony
>> >
>> > --
>> > ------------------------------------------------------------------------
>> > Anthony Molinaro                           <anthonym@alumni.caltech.edu>
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
> --
> ------------------------------------------------------------------------
> Anthony Molinaro                           <anthonym@alumni.caltech.edu>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
