geode-dev mailing list archives

From Bruce Schuchardt <bschucha...@pivotal.io>
Subject Review Request 59925: GEODE-3052 Restarting 2 locators within 1s of each other causes potential locator split brain
Date Thu, 08 Jun 2017 18:36:11 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59925/
-----------------------------------------------------------

Review request for geode, Alexander Murmann, Galen O'Sullivan, Hitesh Khamesra, Udo Kohlmeyer,
and Brian Rowe.


Repository: geode


Description
-------

When restarting from a locatorView.dat file, we should ignore any locator entries in the view.
Recovery tries to get this state from other locators before resorting to the persisted view,
so by the time we fall back to it we know that all of the locator entries in the view are
invalid. Ignoring them allows the locators to move quickly into the concurrent-startup
algorithm and find each other.
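
As a rough illustration of the idea (not the actual GMSLocator change), the filtering step
could look like the sketch below. The class, the Member type, and the method name are all
hypothetical stand-ins for whatever the persisted view really contains; the only behavior
taken from the description above is that locator entries are dropped when the view is
recovered from disk.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these names come from the Geode code base.
final class RecoveredViewFilter {

    /** Hypothetical stand-in for one member entry in a persisted view. */
    static final class Member {
        public final String name;
        public final boolean isLocator;

        Member(String name, boolean isLocator) {
            this.name = name;
            this.isLocator = isLocator;
        }
    }

    /**
     * Returns a copy of the persisted view without its locator entries.
     * Assumption: this is only called after asking other locators for
     * their state has failed, so any locator listed in the file is stale.
     */
    static List<Member> dropLocators(List<Member> persistedView) {
        List<Member> kept = new ArrayList<>();
        for (Member m : persistedView) {
            if (!m.isLocator) {
                kept.add(m);
            }
        }
        return kept;
    }
}
```

With the stale locator entries gone, a restarting locator has no old peers to wait on and
can go straight to looking for other locators that are starting at the same time.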

I removed the Flaky categorization from the test that I modified to reproduce the problem.
A subclass's use of the test was reported as a Flaky failure, but I found that the ticket
was closed.


Diffs
-----

  geode-core/src/main/java/org/apache/geode/distributed/internal/membership/gms/locator/GMSLocator.java e3635f2d93aae212cbff2f2058b6dc728a04776a
  geode-core/src/test/java/org/apache/geode/distributed/LocatorDUnitTest.java 8ff9b67e13dd50499d861ff62ddae3fb8668dd28
  geode-core/src/test/java/org/apache/geode/distributed/LocatorUDPSecurityDUnitTest.java 9d49d30abfb8acccd8a5547ba0ee3c7bcf9e7970


Diff: https://reviews.apache.org/r/59925/diff/1/


Testing
-------

The problem was easily reproduced using LocatorDUnitTest.testStartTwoLocators by repeating
the cycling of the locators.  It failed every time I ran it.


Thanks,

Bruce Schuchardt

