geode-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bruce Schuchardt <bschucha...@pivotal.io>
Subject Re: Review Request 60570: GEODE-3153 Client receives duplicate events during rolling upgrade
Date Fri, 30 Jun 2017 23:15:13 GMT


> On June 30, 2017, 2:56 p.m., Galen O'Sullivan wrote:
> > I've been looking at this with @WireBaron, and we're wondering whether a client
membership ID can still get sent with a zeroed UUID if it's passed between two Gemfire 9.1
servers as a result of client queue replication failover. We tried to write a test and failed.
> > 
> > The basic idea is something like this:
> > * start two servers, an interested client and an event-creating client.
> >   One server is running 9.0 and the other 9.1 .
> > * put a couple of events in the system via the 9.0 server (should be fine with either).
> > * kill the 9.0 server and add a new 9.1 server to the system.
> >   At this point, if we're understanding client queue replication correctly, the
new server should receive a copy of the queue from the other 9.1 server.
> > * Check the same events on the new server to see if they've lost the UUID.
> > 
> > Does that sound reasonable to you?
> > 
> > I'm not sure how to trigger failures in the right order to verify this isn't an
issue, but I think it's reasonably plausible that during rolling upgrades someone could encounter
the issue. It would require an old client to get a queue from a new version server that has
been passed that queue by another new version server.
> > 
> > If that's not possible to trigger, or you can't test it and are confident that the
scenario we described won't happen, then go ahead and ship it.

Please use Geode version numbers.

I modified my new test to do as you described and it passed, as I expected it to.  During
image transfer the first 1.2.0 server would just transmit the membershipID bytes in its toData
method since the target is a 1.2.0 server.  The second server would preserve the UUID bytes
when sending the eventID to the 1.0.0 client.


- Bruce


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/60570/#review179384
-----------------------------------------------------------


On June 30, 2017, 3:02 p.m., Bruce Schuchardt wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/60570/
> -----------------------------------------------------------
> 
> (Updated June 30, 2017, 3:02 p.m.)
> 
> 
> Review request for geode, Alexander Murmann, Barry Oglesby, Galen O'Sullivan, Hitesh
Khamesra, and Brian Rowe.
> 
> 
> Bugs: GEODE-3153
>     https://issues.apache.org/jira/browse/GEODE-3153
> 
> 
> Repository: geode
> 
> 
> Description
> -------
> 
> Another problem was found in backward-compatibility testing.  If a 1.0.0 client was receiving
subscription events generated by a 1.0.0 peer "feeder" member and the events were routed through
a 1.0.0 server the client might see duplicate events when the server is stopped and the client
fails over to a 1.2.0 server holding its redundant subscription queue.  This is especially
possible if a large "ack" period is established in the client.
> 
> The problem stems from the EventID deserialization/reserialization of the memberID bytes
when sending to a 1.0 client.  It was deserializing using Version.CURRENT, which ignores the
UUID bytes in the serialized ID.  Then it serialized the identifier using the client's version,
which includes the UUID bytes but which are zero due to the version used in deserialization.
> 
> 
> Diffs
> -----
> 
>   geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/membership/GMSJoinLeave.java
bc3d708da2ae9a8e386accb8d15e2ed49123241e 
>   geode-core/src/main/java/org/apache/geode/internal/Version.java 557697159da644915e4ffe2405cdddc9ef37c5ac

>   geode-core/src/main/java/org/apache/geode/internal/cache/EventID.java 55c89f1f2e0800371dd4a30c4312c44f942a45ea

>   geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscBCDUnitTest.java
bc48d976096fafe2545e707da68dab5120ddca51 
>   geode-core/src/test/java/org/apache/geode/internal/cache/tier/sockets/ClientServerMiscDUnitTest.java
bfe4646b9abdf6075e8d30fab3d79924faade2aa 
>   geode-core/src/test/resources/org/apache/geode/codeAnalysis/sanctionedDataSerializables.txt
b69e004d63d74eccd5cd562ea269363ba3f2782e 
> 
> 
> Diff: https://reviews.apache.org/r/60570/diff/2/
> 
> 
> Testing
> -------
> 
> new unit tests, precheckin
> 
> 
> Thanks,
> 
> Bruce Schuchardt
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message