harmony-dev mailing list archives

From "Weldon Washburn" <weldon...@gmail.com>
Subject Re: [drlvm][threading] taking a closer look at Harmony-2803 (stress.Mix hangs on rhel4)
Date Fri, 05 Jan 2007 23:31:05 GMT
On 1/5/07, Rana Dasgupta <rdasgupt@gmail.com> wrote:
>
> Weldon,
>    What I reported earlier were the results with stress.Mix with
> selectThreadType returning "spawn" only.
>    With the MegaSpawn test:
>
>   On a 2 cpu XP box:
>    - RI passes
>   - DRLVM fails with OOME when creating new threads and hangs
>
> On a 2 cpu RHEL4 box
> - DRLVM fails with OOME when creating new threads and hangs
> - RI also occasionally fails with OOME when creating new threads, but
> sometimes passes.
>
>
> OOME is a resource exhaustion error, and it seems that it can happen with
> both RI and DRLVM if we do uncontrolled creation. Do you think that we
> should treat it as a bug?


Good question.  The first step is to create a test case that always triggers
the bug in as few lines of code as possible.  From both Naveen's feedback and
yours, it looks like MegaSpawn.java fills the bill.  The next step is to
decide if bugs uncovered by MegaSpawn are worth chasing.  I am inclined to
open a bug report and set the urgency low, and at the same time modify
stress.Mix so that it does not spawn zillions of threads.  The intention is
to make stress.Mix a regression test that passes today on 1, 2 and 4-way
boxes.  If, at some future point, we see an important workload generate gobs
of threads and crash the same way as MegaSpawn, we can bump up the priority
of this bug.
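
As a rough illustration of what "does not spawn zillions of threads" could
look like, here is a minimal sketch.  It is not the actual stress.Mix or
MegaSpawn code: the class name BoundedSpawn, the method spawnBounded and
the cap parameter are all hypothetical, and the value 60 is only borrowed
from the thread_number field visible in the diff further down.  The sketch
caps the number of threads it tries to start and treats an OutOfMemoryError
from Thread.start() as "stop spawning" rather than as a test failure.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not taken from stress.Mix or MegaSpawn.
public class BoundedSpawn {

    // Starts at most min(requested, cap) short-lived threads, waits for
    // them, and returns how many actually ran.
    static int spawnBounded(int requested, int cap) throws InterruptedException {
        int limit = Math.min(requested, cap);
        List<Thread> started = new ArrayList<>();
        AtomicInteger ran = new AtomicInteger();

        for (int i = 0; i < limit; i++) {
            Thread t = new Thread(ran::incrementAndGet);
            try {
                t.start();
            } catch (OutOfMemoryError e) {
                // Assumption: resource exhaustion while creating a thread
                // is tolerated; we simply stop spawning more.
                break;
            }
            started.add(t);
        }

        // Wait for every thread that did start, so the test cannot
        // report success while workers are still running.
        for (Thread t : started) {
            t.join();
        }
        return ran.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // 60 mirrors thread_number in the diff; using it as the cap is an
        // assumption made for this example.
        System.out.println("threads run: " + spawnBounded(1000, 60));
    }
}
```

With a cap like this the test still exercises thread creation on every
run, but it no longer depends on how many threads a given box can hold
before it runs out of memory.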


> Maybe we should lower the thresholds to let the
> test pass.
>
> Rana
>
>
>
> On 1/5/07, Rana Dasgupta <rdasgupt@gmail.com> wrote:
> >
> > Hi Weldon,
> >    On a 2 cpu smp RHEL4 box:
> >      - on RI the modified test always passes
> >      - DRLVM always fails with OOME when trying to spawn threads
> >
> >   On a 2 cpu XP box:
> >     - RI kills the "occupy" thread (I thought this is really smart), but
> >       passes
> >     - DRLVM sometimes passes, sometimes hangs, occasionally reports
> >       invalid memory references, VM launcher errors etc.
> >
> >
> > Rana
> >
> >  On 1/5/07, Weldon Washburn <weldonwjw@gmail.com> wrote:
> > >
> > > All,
> > >
> > > > An update.  I think I can cause the bug(s) we are discussing in this
> > > > email chain to surface 100% of the time on 2 cpu rhel4.  Hacking
> > > > stress.Mix such that it only runs the "spawn" method seems to do it.
> > > > This hack always fails on 2 cpu rhel4 and never fails on 1 cpu winxp.
> > > > It would be interesting to try on 1 cpu rhel4 and 2 cpu winxp.  If
> > > > anyone has this combination, please try it.  This would tell us if
> > > > the bug is sensitive to OS.
> > >
> > > > When I run the below patch on RI, it completes successfully every
> > > > time.  Curiously it looks like the RI kills the "occupy" thread.  It
> > > > prints on the console output, "occupy terminated by
> > > > java.lang.OutOfMemoryError...".  But I don't see occupy terminate
> > > > when running drlvm.  I will run more tests.  This may actually be
> > > > compounding the problems we are seeing.
> > >
> > > Naveen, Rana,
> > > > Can you try the below patch on your hardware to see if you can
> > > > reproduce what I describe above?  Does the output look the same as
> > > > Harmony-2803?
> > >
> > >
> > > > Below is an svn diff that makes the hard failure happen on 2 cpu
> > > > rhel4:
> > >
> > > Index: Mix.java
> > > ===================================================================
> > > --- Mix.java    (revision 491852)
> > > +++ Mix.java    (working copy)
> > > @@ -93,6 +93,8 @@
> > >
> > >     static Random random = new Random(0);
> > >     static String selectThreadType(int i) {
> > > +               return "spawn";
> > > +               /*
> > >         switch (i % 9) {
> > >             case 0: return "uncontended";
> > >             case 1: return "contended";
> > > @@ -105,6 +107,7 @@
> > >             case 8: return "exceptions";
> > >         }
> > >         return "nothing";
> > > +               */
> > >     }
> > >
> > >     static int thread_number = 60;
> > >
> > >
> > >
> > > On 1/4/07, Naveen Neelakantam < neelakan@uiuc.edu> wrote:
> > > >
> > > >
> > > > On Jan 4, 2007, at 4:28 PM, Weldon Washburn wrote:
> > > >
> > > > > > I see it hang consistently when running automated mode (build
> > > > > > test).  I have seen it hang once when running manually from a
> > > > > > linux terminal window.  It actually printed out "PASSED" then
> > > > > > hung.  This leads me to suspect there might be problems with how
> > > > > > System.out.flush() is working when there are multiple threads
> > > > > > running on SMP.  Are you running on an SMP box?  Can you give me
> > > > > > the exact command line you are using?  I would like to try it on
> > > > > > my box.
> > > >
> > > > > Ok, cool.  I was seeing the exact same behavior (i.e. the test
> > > > > prints PASSED and then hangs).  So it sounds like you are on the
> > > > > right track, to me anyway.
> > > >
> > > > > But to answer your questions: I am running RHEL4 update 4 on a
> > > > > core2 duo.  The command line I am using is "java -cp . stress.Mix"
> > > > > (with my path and JAVA_HOME set appropriately).
> > > >
> > > > Naveen
> > > >
> > >
> > >
> > >
> > > --
> > > Weldon Washburn
> > > Intel Enterprise Solutions Software Division
> > >
> > >
> >
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division
