river-dev mailing list archives

From "Christopher Dolan" <christopher.do...@avid.com>
Subject RE: client hang in com.sun.jini.jeri.internal.mux.Mux.start()
Date Wed, 04 May 2011 18:05:26 GMT
Here's a test that consistently fails with the current Mux
implementation and passes with the patch I proposed at the beginning of
this thread. In my test I explicitly pretend that the server side of the
connection has blocked. In reality, all we need to agree on is that it's
possible for the server side to block.

The proposed patch needs a little more work to make the timeout
configurable. Once it is, the test can be sped up by setting that timeout
to something unrealistically short.
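
As an illustration only (this is not the patch from this thread), a bounded
wait with a configurable timeout might look like the sketch below. The class
name BoundedStart, its methods, and the system property
"example.mux.startTimeout" are all hypothetical; the point is just that the
wait loop tracks a deadline and fails with an IOException instead of
blocking forever.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: waits for a startup handshake, but only up to a limit. */
class BoundedStart {
    private final Object lock = new Object();
    private boolean handshakeDone = false; // guarded by lock
    private final long timeoutMillis;

    BoundedStart(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Reads the timeout from a made-up system property, defaulting to 15 s. */
    static BoundedStart fromSystemProperty() {
        return new BoundedStart(Long.getLong("example.mux.startTimeout", 15000L));
    }

    /** Called by a reader thread once the peer's first message has arrived. */
    void handshakeReceived() {
        synchronized (lock) {
            handshakeDone = true;
            lock.notifyAll();
        }
    }

    /** Blocks until the handshake arrives; throws IOException on timeout. */
    void awaitHandshake() throws IOException, InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            while (!handshakeDone) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new IOException(
                        "timed out after " + timeoutMillis + " ms waiting for peer");
                }
                // Add 1 ms so a sub-millisecond remainder never turns into
                // wait(0), which would block indefinitely.
                lock.wait(TimeUnit.NANOSECONDS.toMillis(remainingNanos) + 1);
            }
        }
    }
}
```

With something like this in place, a test could construct the helper with a
timeout of a few tens of milliseconds rather than waiting out a
production-sized default.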

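// NOTE: the archive dropped several lines of this listing. Lines marked
// "(reconstructed)" are assumptions inferred from the surrounding code.
public class MuxStartTimeout {
    @Test // (reconstructed) see the P.S. about TestNG/JUnit
    public void test() throws IOException, InterruptedException {
        // make fake input and output streams.
        OutputStream os = new ByteArrayOutputStream();
        InputStream is = new InputStream() {
            public synchronized int read() throws IOException {
                try {
                    // block indefinitely
                    while (true)
                        wait(); // (reconstructed)
                } catch (InterruptedException e) {
                    return 0;
                }
            }
        };

        final AtomicBoolean finished = new AtomicBoolean(false);
        final AtomicBoolean succeeded = new AtomicBoolean(false);
        final AtomicBoolean failed = new AtomicBoolean(false);
        final MuxClient muxClient = new MuxClient(os, is);
        try {
            Thread t = new Thread(new Runnable() {
                public void run() {
                    try {
                        muxClient.start(); // (reconstructed)
                        succeeded.set(true); // (reconstructed)
                    } catch (IOException e) {
                        failed.set(true); // (reconstructed)
                    } finally {
                        finished.set(true); // (reconstructed)
                    }
                }
            });
            t.start(); // (reconstructed)
            // (reconstructed) assumed bound; must exceed the patch's start timeout
            t.join(60000);
            if (!t.isInterrupted())
                t.interrupt(); // (reconstructed)
            Assert.assertTrue(finished.get()); // (reconstructed)
            Assert.assertTrue(failed.get()); // (reconstructed)
        } finally {
            muxClient.shutdown("end of test");
        }
    }
}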

P.S. Amusingly, I actually compiled the test against
org.testng.annotations.Test and org.testng.Assert, but it should also work
as written against org.junit.Test and org.junit.Assert.

-----Original Message-----
From: Patricia Shanahan [mailto:pats@acm.org] 
Sent: Wednesday, May 04, 2011 11:24 AM
To: dev@river.apache.org
Subject: Re: client hang in com.sun.jini.jeri.internal.mux.Mux.start()

This raises a more general question that has been troubling me: What 
should we do about theoretical deadlocks and similar concurrency issues 
that have not been demonstrated in practice?

On the one hand, I like to have a test to show that a change really 
fixed something. On the other hand, a concurrency problem can contribute 
to general flakiness without ever reaching the point of being reported 
as a bug or having a test that demonstrates it.


On 5/4/2011 8:47 AM, Christopher Dolan wrote:
> I haven't conclusively witnessed that specific deadlock, but I've had a
> closely related problem where another process coincidentally grabs
> 4160 before Reggie gets it. This happens because Win2k, WinXP and
> use 1024-5000 for their dynamic port range, contrary to IANA
> recommendations. I suspect the deadlock described above happens in
> real life, but I've never gotten detailed enough logs to prove it, just
> client stack traces showing the hang in Mux.start().
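
One way to gather evidence for the port-collision scenario quoted above is to
probe the port before starting the service. The following is a minimal
sketch, assuming a plain TCP bind is a fair approximation of what the
service does; the class PortProbe is hypothetical and not part of River.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Hypothetical diagnostic: reports whether a TCP port can be bound right now. */
class PortProbe {
    /** Returns true if a listening socket could be bound to the given port. */
    static boolean isFree(int port) {
        // The socket is closed again immediately by try-with-resources.
        try (ServerSocket socket = new ServerSocket(port)) {
            return true;
        } catch (IOException e) {
            // Typically a BindException: some other process already owns the port.
            return false;
        }
    }

    public static void main(String[] args) {
        int port = 4160; // the port mentioned in the quoted message
        System.out.println("port " + port + (isFree(port) ? " is free" : " is in use"));
    }
}
```

A probe like this is inherently racy, since another process can take the
port between the check and the real bind, so it is useful for logging and
diagnosis rather than as a guarantee.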
