harmony-dev mailing list archives

From "Ivan Volosyuk" <ivan.volos...@gmail.com>
Subject Re: [classlib][luni] signalis interruptus in hysock
Date Thu, 26 Oct 2006 12:42:02 GMT
On 10/25/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
>
>
> Fedotov, Alexei A wrote:
> > Guys,
> >
> > Could you please help me to understand the following?
> >
> > 1. Is HARMONY-1904 actually a duplicate of my HARMONY-1879?
>
> scanning quickly, I don't think so.
>
> > 2. Ivan, do I remember correctly that you've already fixed that bug once
> > when debugging Eclipse long run failures? Where is that patch?
>
> this bug arose when the new TM was added, which uses signals much more
> aggressively.
>
> geir

Well, the bug has existed for quite a long time and was reproducible before.
The older TM also used signals to stop threads for GC. The patch I
created was not integrated at the time; it was almost the same as
the currently suggested patch. The only difference was that it handled
the timeout correctly (for other Unixes).
--
Ivan
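
The timeout handling mentioned above can be sketched roughly as follows.
This is a minimal illustration, not the HARMONY-1904 patch or the earlier
one: wait_readable() and now_ms() are hypothetical names, and the code
assumes a POSIX system with clock_gettime(CLOCK_MONOTONIC). The idea is to
retry select() after EINTR while recomputing the remaining time from a fixed
deadline, so that repeated signals cannot stretch the overall wait.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/select.h>
#include <time.h>

/* Hypothetical helper: current monotonic time in milliseconds. */
static long long now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/*
 * Hypothetical helper, not part of hysock: wait until fd is readable.
 * Returns 1 if readable, 0 on timeout, -1 on an error other than EINTR.
 */
int wait_readable(int fd, long timeout_ms)
{
    long long deadline = now_ms() + timeout_ms;

    for (;;) {
        /* Recompute what is left of the original timeout on every pass. */
        long long remaining = deadline - now_ms();
        if (remaining < 0)
            remaining = 0;

        /* select() may modify both the fd set and the timeval, so rebuild them. */
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);

        struct timeval tv;
        tv.tv_sec = remaining / 1000;
        tv.tv_usec = (remaining % 1000) * 1000;

        int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
        if (rc > 0)
            return 1;   /* fd is readable */
        if (rc == 0)
            return 0;   /* the deadline passed */
        if (errno != EINTR)
            return -1;  /* a real error; the caller can inspect errno */
        /* EINTR: a signal interrupted the call, so loop and try again. */
    }
}
```

Relying on select() to update the timeval itself is Linux-specific behaviour,
which is why the sketch tracks an explicit deadline instead.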

>
> >
> > Thank you in advance.
> >
> > With best regards,
> > Alexei Fedotov,
> > Intel Java & XML Engineering
> >
> >> -----Original Message-----
> >> From: Weldon Washburn [mailto:weldonwjw@gmail.com]
> >> Sent: Wednesday, October 25, 2006 5:36 PM
> >> To: harmony-dev@incubator.apache.org; geir@pobox.com
> >> Subject: Re: [classlib][luni] signalis interruptus in hysock
> >>
> >> On 10/24/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
> >>>
> >>>
> >>> Weldon Washburn wrote:
> >>>> It seems JIRA is down for maintenance.  If HARMONY-1904 is still open
> >>>> perhaps it makes sense to put a counter in the while (...) { select...}
> >>>> loop. And after every N loops, print a warning/diagnostic message.
> >>> For whom and to what end?  Why not just return EINTR (in hysock speak)?
> >>>> The value for N would have to be tuned.  I don't know what the best
> >>>> number would be. Given that 1904 patch is not the final solution, at
> >>>> least a diagnostic that hints at where the system hangs would be useful.
> >>>> It might make sense to even print a stack trace.   Also, I agree with
> >>>> Ivan below.  Signals bugs are very hard to debug.  And diagnostics can
> >>>> help us all understand the corner cases better.
> >>> But so far, no one has shown that the system hangs, or can hang, simply
> >>> because we return EINTR....
> >>
> >> Sorry for not being clear.  I was reacting to the patch in 1904 itself.
> >> Not the bigger issue of fixing the upper layers to comprehend EINTR.  My
> >> understanding is that this patch does not fix the problem.  This patch
> >> does not return EINTR.  If for whatever reason this patch is committed, I
> >> recommend adding the above diagnostic code so that we don't dig ourselves
> >> an even deeper hole.
> >>
> >> If it is decided 1904 should not be committed, it might make sense to
> >> close it with  "won't fix".
> >>
> >> geir
> >>>> On 10/20/06, Ivan Volosyuk <ivan.volosyuk@gmail.com> wrote:
> >>>>> On 10/20/06, Geir Magnusson Jr. <geir@pobox.com> wrote:
> >>>>>>
> >>>>>> Ivan Volosyuk wrote:
> >>>>>>> Well, I think that the solution is what Geir suggests. One thing which
> >>>>>>> bothers me is the following. EINTR can happen in different places and
> >>>>>>> the situations can be quite rare in some circumstances. It can lead to
> >>>>>>> hard to reproduce stability bugs (race conditions).
> >>>>>> Can you give an example?
> >>>>> Half a year ago, I was working on the problem. Socket operations get
> >>>>> sometimes interrupted. We have found out that it occurs sometime after
> >>>>> GC. It was not quite easy as the application was quite big and
> >>>>> situation - quite rare.
> >>>>>
> >>>>> Given the fact, that current implementation of monitor reservation
> >>>>> code can stop other thread in quite random fashion we should have rock
> >>>>> solid support of EINTR handling everywhere the select(), poll() calls
> >>>>> is used.
> >>>>>
> >>>>> --
> >>>>> Ivan
> >>>>> Intel Enterprise Solutions Software Division
> >>>>>
> >>>>>>> We should find a
> >>>>>>> way how to test the implementation.
> >>>>>> +1!
> >>>>>>
> >>>>>> :)
> >>>>>>
> >>>>>> geir
