qpid-users mailing list archives

From Adel Boutros <adelbout...@live.com>
Subject Re: [Qpid Dispatch - 0.7.0] Random failure on unit test "system_tests_link_routes" on Linux
Date Fri, 03 Mar 2017 20:42:35 GMT
Great!

I had stumbled on this issue by coincidence while debugging another one on Solaris.

It might be useful to execute the unit tests multiple times on the CI to detect such random
issues. It is what we currently do, and it has revealed some bugs for us over time.
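
A minimal sketch of such a repeat-run loop, assuming the test can be launched as a single command. The helper name `run_repeatedly` and the stand-in command are illustrative only and are not part of the Dispatch build:

```python
import subprocess
import sys


def run_repeatedly(cmd, runs):
    """Run cmd `runs` times and return the 1-based indices of the failed runs."""
    failed = []
    for i in range(1, runs + 1):
        # Capture output so a long loop does not flood the CI log.
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        if result.returncode != 0:
            failed.append(i)
    return failed


if __name__ == "__main__":
    # Stand-in command that always succeeds; replace it with the real
    # test invocation used on your CI.
    cmd = [sys.executable, "-c", "raise SystemExit(0)"]
    failures = run_repeatedly(cmd, runs=10)
    print("failed runs:", failures)
```

Reporting which runs failed, rather than stopping at the first failure, gives a rough idea of how often the problem occurs.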

Regards,
Adel

________________________________
From: Ganesh Murthy <gmurthy@redhat.com>
Sent: Friday, March 3, 2017 8:16:15 PM
To: users@qpid.apache.org
Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test "system_tests_link_routes"
on Linux

Hi Adel,
    We found the reason the test was failing. I have entered a JIRA for this -

https://issues.apache.org/jira/browse/DISPATCH-646

The fix for this problem is not straightforward, and we are still discussing it. It only affects
drain processing on link routes. The failure can be reproduced more frequently if you remove
'workerThreads': 1 from system_test.py, which makes workerThreads default to 4.
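
As a rough illustration of the override described above, the sketch below models the test settings as a plain dictionary. The names `ROUTER_DEFAULT_WORKER_THREADS` and `effective_worker_threads` are hypothetical and are not taken from system_test.py; only the values 1 and 4 come from this message:

```python
# Hypothetical model of the override; this is not the actual system_test.py code.
ROUTER_DEFAULT_WORKER_THREADS = 4  # default stated in the message above


def effective_worker_threads(test_overrides):
    """Return the thread count that applies for the given test overrides."""
    return test_overrides.get("workerThreads", ROUTER_DEFAULT_WORKER_THREADS)


with_override = {"workerThreads": 1}  # the test harness pins a single thread
without_override = {}                 # entry removed, so the default applies

print(effective_worker_threads(with_override))
print(effective_worker_threads(without_override))
```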

Thanks for bringing this to our attention.

Thanks.

----- Original Message -----
> From: "Adel Boutros" <Adelboutros@live.com>
> To: users@qpid.apache.org
> Sent: Thursday, March 2, 2017 10:48:04 AM
> Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> "system_tests_link_routes" on Linux
>
> Great!
>
>
> Good luck
>
> Adel
>
> ________________________________
> From: Ganesh Murthy <gmurthy@redhat.com>
> Sent: Thursday, March 2, 2017 4:18:54 PM
> To: users@qpid.apache.org
> Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> "system_tests_link_routes" on Linux
>
> Hi Adel,
>    Luckily, I was able to reproduce your problem locally. I will keep you
>    posted on my findings.
> Thanks.
>
> ----- Original Message -----
> > From: "Adel Boutros" <Adelboutros@live.com>
> > To: users@qpid.apache.org
> > Sent: Wednesday, March 1, 2017 3:11:59 PM
> > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > "system_tests_link_routes" on Linux
> >
> > Hello Ganesh,
> >
> >
> > It seems that "zip" files are not allowed and thus my mail is never getting
> > delivered.
> >
> >
> > You can get it from here: http://www.filedropper.com/test_210
> >
> >
> >
> > Regards,
> >
> > Adel
> >
> > ________________________________
> > From: Adel Boutros
> > Sent: Wednesday, March 1, 2017 9:09:29 PM
> > To: users@qpid.apache.org
> > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > "system_tests_link_routes" on Linux
> >
> >
> > It seems for some reason my mail wasn't sent. I am re-attaching the output
> > here.
> >
> >
> > Can you confirm you got it now?
> >
> >
> > Regards,
> >
> > Adel
> >
> > ________________________________
> > From: Ganesh Murthy <gmurthy@redhat.com>
> > Sent: Wednesday, March 1, 2017 9:05:43 PM
> > To: users@qpid.apache.org
> > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > "system_tests_link_routes" on Linux
> >
> > Hi Adel,
> > >    I sent you a subsequent email asking you for the log files. Did you
> > >    already respond to that email? For some reason, I am not seeing your
> > >    response. I see only your first response, where you attached the
> > >    output of PN_TRACE_FRM=1.
> >
> > Can you please resend the log files?
> >
> > Sorry, Thanks.
> >
> > ----- Original Message -----
> > > From: "Adel Boutros" <Adelboutros@live.com>
> > > To: users@qpid.apache.org
> > > Sent: Wednesday, March 1, 2017 2:55:21 PM
> > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > "system_tests_link_routes" on Linux
> > >
> > > Hello Ganesh,
> > >
> > >
> > > Did you have any luck with what I sent you?
> > >
> > >
> > > Regards,
> > >
> > > Adel
> > >
> > > ________________________________
> > > From: Adel Boutros
> > > Sent: Wednesday, March 1, 2017 9:56:44 AM
> > > To: users@qpid.apache.org
> > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > "system_tests_link_routes" on Linux
> > >
> > >
> > > No worries, it's ok :)
> > >
> > >
> > > You will find attached the requested content.
> > >
> > >
> > > Regards,
> > >
> > > Adel
> > >
> > > ________________________________
> > > From: Ganesh Murthy <gmurthy@redhat.com>
> > > Sent: Tuesday, February 28, 2017 8:24:20 PM
> > > To: users@qpid.apache.org
> > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > "system_tests_link_routes" on Linux
> > >
> > > Hi Adel,
> > > >    Thanks for sending the trace. I realized that what I actually
> > > >    wanted to see is the log files of all routers so I can get a
> > > >    complete view of the traffic between the routers. Can you please
> > > >    zip up the contents of the
> > >    <qpid-dispatch-install-folder>/build/system_test.dir/system_tests_link_routes/LinkRouteTest/setUpClass
> > >    folder and send it?
> > >
> > > > Before you zip up the contents of that folder, please delete all the
> > > > contents of the folder and then run only the relevant test like this -
> > >
> > > /usr/bin/python "<dispatch-install-folder>/build/tests/run.py" "-m"
> > > "unittest" "-v"
> > > "system_tests_link_routes.LinkRouteTest.test_www_drain_support_all_messages"
> > >
> > > (this way, the log files will only have the trace from the relevant test)
> > >
> > > Thanks much.
> > >
> > > ----- Original Message -----
> > > > From: "Adel Boutros" <Adelboutros@live.com>
> > > > To: users@qpid.apache.org
> > > > Sent: Tuesday, February 28, 2017 1:08:53 PM
> > > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > > "system_tests_link_routes" on Linux
> > > >
> > > >
> > > >
> > > > You will find attached the result of the below command.
> > > >
> > > >
> > > >
> > > >
> > > > Adel
> > > >
> > > > From: Ganesh Murthy <gmurthy@redhat.com>
> > > > Sent: Tuesday, February 28, 2017 6:51:42 PM
> > > > To: users@qpid.apache.org
> > > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > > "system_tests_link_routes" on Linux
> > > > Hi Adel,
> > > > > Can you please run that specific unit test with PN_TRACE_FRM=1 and send
> > > > > the output? This is how you run the specific test -
> > > >
> > > > PN_TRACE_FRM=1 /usr/bin/python
> > > > "<dispatch-install-folder>/build/tests/run.py"
> > > > "-m" "unittest" "-v"
> > > > "system_tests_link_routes.LinkRouteTest.test_www_drain_support_all_messages"
> > > >
> > > > Thanks.
> > > >
> > > > ----- Original Message -----
> > > > > From: "Adel Boutros" <Adelboutros@live.com>
> > > > > To: users@qpid.apache.org
> > > > > Sent: Tuesday, February 28, 2017 10:20:42 AM
> > > > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > > > "system_tests_link_routes" on Linux
> > > > >
> > > > > Hi Ganesh,
> > > > >
> > > > >
> > > > > > Yes, I checked your fix, but it doesn't seem to work here. The test
> > > > > > still fails even when the timeout is increased to 100.
> > > > > >
> > > > > >
> > > > > > It seems the drain always receives only 8 messages out of 10. However,
> > > > > > I didn't check on trunk to see whether it is fixed there.
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > Adel
> > > > >
> > > > > ________________________________
> > > > > From: Ganesh Murthy <gmurthy@redhat.com>
> > > > > Sent: Tuesday, February 28, 2017 3:58:37 PM
> > > > > To: users@qpid.apache.org
> > > > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > > > "system_tests_link_routes" on Linux
> > > > >
> > > > > Hi Adel,
> > > > > > We did notice the same problem you are seeing, and we ended up
> > > > > > increasing the timeout from 5 to 10 (although on a different test), as
> > > > > > seen in this commit on the master branch -
> > > > >
> > > > > https://github.com/apache/qpid-dispatch/commit/5e6b2e65b2ea9614d7619711961d38aceefb49d4
> > > > >
> > > > > > Is it correct that even when you increased the timeout to 100, the
> > > > > > test still fails randomly?
> > > > >
> > > > > Thanks.
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Adel Boutros" <Adelboutros@live.com>
> > > > > > To: users@qpid.apache.org
> > > > > > Sent: Tuesday, February 28, 2017 9:00:36 AM
> > > > > > > Subject: Re: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > > > > > "system_tests_link_routes" on Linux
> > > > > >
> > > > > > > I increased the timeout from 5 to 100, and the test, which now takes
> > > > > > > 110 seconds instead of 10 seconds, is still failing. After re-checking,
> > > > > > > the error message is actually the same on each failure. I think there
> > > > > > > is an issue here.
> > > > > >
> > > > > >
> > > > > > My patch:
> > > > > >
> > > > > >
> > > > > > > diff --git a/tests/system_tests_drain_support.py b/tests/system_tests_drain_support.py
> > > > > > > index e4bd2bc..9a4a86f 100644
> > > > > > > --- a/tests/system_tests_drain_support.py
> > > > > > > +++ b/tests/system_tests_drain_support.py
> > > > > > > @@ -46,7 +46,7 @@ class DrainMessagesHandler(MessagingHandler):
> > > > > > >          self.conn.close()
> > > > > > >
> > > > > > >      def on_start(self, event):
> > > > > > > -        self.timer = event.reactor.schedule(5, Timeout(self))
> > > > > > > +        self.timer = event.reactor.schedule(100, Timeout(self))
> > > > > > >          self.conn = event.container.connect(self.address)
> > > > > > >
> > > > > > >          # Create a sender and a receiver. They are both listening on the same address
> > > > > >
> > > > > >
> > > > > > ==================
> > > > > >
> > > > > > Error
> > > > > >
> > > > > > ===================
> > > > > >
> > > > > >
> > > > > > > 13: FAIL: test_www_drain_support_all_messages (system_tests_link_routes.LinkRouteTest)
> > > > > > > 13: ----------------------------------------------------------------------
> > > > > > > 13: Traceback (most recent call last):
> > > > > > > 13:   File "/qpid-dispatch-0.7.0/tests/system_tests_link_routes.py", line 489, in test_www_drain_support_all_messages
> > > > > > > 13:     self.assertEqual(None, drain_support.error)
> > > > > > > 13: AssertionError: None != 'Timeout Expired: sent: 10 rcvd: 8'
> > > > > > > 13:
> > > > > > > 13: ----------------------------------------------------------------------
> > > > > > > 13: Ran 15 tests in 109.933s
> > > > > > > 13:
> > > > > > > 13: FAILED (failures=1)
> > > > > > > 1/1 Test #13: system_tests_link_routes .........***Failed 110.16 sec
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Adel
> > > > > >
> > > > > > ________________________________
> > > > > > From: Adel Boutros <Adelboutros@live.com>
> > > > > > Sent: Tuesday, February 28, 2017 2:47:16 PM
> > > > > > To: users@qpid.apache.org
> > > > > > Subject: [Qpid Dispatch - 0.7.0] Random failure on unit test
> > > > > > "system_tests_link_routes" on Linux
> > > > > >
> > > > > > Hello,
> > > > > >
> > > > > >
> > > > > > > I noticed an intermittent failure in the test
> > > > > > > "system_tests_link_routes" with Dispatch Router 0.7.0.
> > > > > > >
> > > > > > > When I launch the same test 10 times, it fails randomly (the failure
> > > > > > > occurs at the 4th or 5th run) but always with a similar error. It seems
> > > > > > > the timeout of 5 seconds is not enough, especially on slow machines.
> > > > > >
> > > > > >
> > > > > > I had submitted a patch for a similar task
> > > > > > ( https://issues.apache.org/jira/browse/DISPATCH-627 )
> > > > > >
> > > > > >
> > > > > > Can you please confirm?
> > > > > >
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Adel
> > > > > >
> > > > > >
> > > > > > 13:
> > > > > > ======================================================================
> > > > > > 13: FAIL: test_www_drain_support_all_messages
> > > > > > (system_tests_link_routes.LinkRouteTest)
> > > > > > 13:
> > > > > > ----------------------------------------------------------------------
> > > > > > 13: Traceback (most recent call last):
> > > > > > 13: File "/qpid-dispatch-0.7.0/tests/system_tests_link_routes.py",
> > > > > > line
> > > > > > 489, in test_www_drain_support_all_messages
> > > > > > 13: self.assertEqual(None, drain_support.error)
> > > > > > 13: AssertionError: None != 'Timeout Expired: sent: 10 rcvd:
8'
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > > ---------------------------------------------------------------------
> > > > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > > > For additional commands, e-mail: users-help@qpid.apache.org
> > > > >
> > > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > > For additional commands, e-mail: users-help@qpid.apache.org
> > > >
> > > >
> > > >
> > > > ---------------------------------------------------------------------
> > > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > > For additional commands, e-mail: users-help@qpid.apache.org
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > > For additional commands, e-mail: users-help@qpid.apache.org
> > >
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> > For additional commands, e-mail: users-help@qpid.apache.org
> >
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
> For additional commands, e-mail: users-help@qpid.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org

