mesos-reviews mailing list archives

From Joseph Wu <jos...@mesosphere.io>
Subject Review Request 52181: WIP: Added synchronization in link logic to prevent relinking races.
Date Fri, 23 Sep 2016 21:54:49 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52181/
-----------------------------------------------------------

Review request for mesos, Benjamin Mahler, Artem Harutyunyan, and Joris Van Remoortere.


Bugs: MESOS-6234
    https://issues.apache.org/jira/browse/MESOS-6234


Repository: mesos


Description
-------

There is a general pattern in the `SocketManager` in which most methods
will grab the mutex and then check if the socket to manage exists in
the `SocketManager`'s mapping.  If the socket does not exist, the
`SocketManager` silently returns.

This adds similar logic inside two critical sections of the `link`
codepath.  If there are multiple calls to `link` in-flight at once,
this prevents sockets from being leaked into unmanaged callback loops.
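
For illustration only, the lock-then-check pattern described above can be
sketched as follows.  This is not the code in the diff: `SketchSocketManager`,
`continueLink`, `linkCount`, and the use of a plain integer descriptor are all
hypothetical names chosen for the sketch, and the real `SocketManager` in
libprocess tracks considerably more state.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical stand-in for libprocess's SocketManager; every name here is
// illustrative and not taken from process.cpp.
class SketchSocketManager {
public:
  // Start managing a socket, identified here by a plain descriptor.
  void add(int fd) {
    assert(fd >= 0);
    std::lock_guard<std::mutex> lock(mutex_);
    links_.emplace(fd, 0);
  }

  // Stop managing a socket, e.g. because a concurrent relink replaced it.
  void close(int fd) {
    std::lock_guard<std::mutex> lock(mutex_);
    links_.erase(fd);
  }

  // One step of a link: take the mutex, re-check that the socket is still
  // managed, and return without side effects if it is not.
  bool continueLink(int fd) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = links_.find(fd);
    if (it == links_.end()) {
      return false;  // Socket is gone; silently do nothing.
    }
    ++it->second;
    return true;
  }

  // Number of link steps completed for a managed socket (0 if unmanaged).
  int linkCount(int fd) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = links_.find(fd);
    return it == links_.end() ? 0 : it->second;
  }

private:
  std::mutex mutex_;
  std::map<int, int> links_;  // descriptor -> completed link steps
};
```

In this sketch, a `continueLink` call that arrives after another caller has
closed the socket returns early instead of acting on a socket the manager no
longer tracks, which is the property the description above relies on.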


Diffs
-----

  3rdparty/libprocess/src/process.cpp 02a192529e53479d5a163fa6a20873674b51ee2c 
  3rdparty/libprocess/src/tests/process_tests.cpp b9feec7e34cffe19e49035f8865b150f79258f54


Diff: https://reviews.apache.org/r/52181/diff/


Testing
-------

make check

3rdparty/libprocess/libprocess-tests --gtest_filter="ProcessRemoteLinkTest.RemoteRelinkLoop" --gtest_repeat=5000 --gtest_break_on_failure

Investigating a consistent failure after ~80 iterations:
```
I0923 13:26:08.611300 2060546816 process.cpp:1446] Failed to link, connect: Failed to connect to 0.0.0.0:51459: Can't assign requested address
```


Thanks,

Joseph Wu

