harmony-dev mailing list archives

From "Weldon Washburn" <weldon...@gmail.com>
Subject Re: [drlvm] stress.Mix / MegaSpawn threading bug
Date Tue, 09 Jan 2007 15:51:52 GMT
On 1/9/07, Gregory Shimansky <gshimansky@gmail.com> wrote:
>
> Geir Magnusson Jr. wrote:
> > I started a new thread because I think this is really important.
> >
> > I've also added a page in the wiki to track this stuff, because I can't
> > keep it in my head:
> >
> >   http://wiki.apache.org/harmony/MegaSpawnThreadingBug
> >
> > which you can get to from the home page via the "WhiteBoards" section,
> > intended to be a place where we can work as a team on a whiteboard, with
> > the intention that once the mini-project is over, we erase...


This is a good idea.  I still want to put some of the discussion on email so
that we have a permanent record of our investigations.  I have some thoughts
inlined below.

>
> > I think this is a scary scary problem :)
>
> I've tried to analyze MegaSpawn test on windows and here's what I found
> out.
>
> OOME is thrown because process virtual size easily gets up to 2Gb. This
> happens at about ~1.5k simultaneously running threads. I think it
> happens because all of virtual process memory is mapped for thread stacks.
>
> When virtual memory is exhausted all kind of problems may occur. In many
> places there are assertions that malloc returns non-NULL, and these
> assertions fail. In some places there are no checks for malloc, and NULL
> pointer is used for addressing, this also crashes VM.


Good job!  I got the same sort of hunch when I looked at the source code but
did not have enough time to pin down specifics.  The only guidance I
found regarding what happens when too many threads are spawned is the
following in the java.lang.Thread reference manual, "...specifying a lower
[stacksize] value may allow a greater number of threads to exist
concurrently without throwing an OutOfMemoryError (or other internal
error)."
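
To make the stacksize point concrete, here is a minimal sketch of a test that
requests a small stack through the four-argument java.lang.Thread constructor
and stops spawning when OutOfMemoryError shows up.  The class name
SmallStackSpawn, the 64K stack size and the thread counts are made up for
illustration, and the VM is free to round the stack size up or ignore it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper for illustration; not part of DRLVM or the class library.
class SmallStackSpawn {

    /**
     * Tries to start `wanted` threads, each requesting `stackBytes` of stack,
     * and returns how many actually started. The threads stay parked on a
     * latch until spawning ends, so they are all alive at the same time.
     */
    static int spawn(int wanted, long stackBytes) {
        final CountDownLatch release = new CountDownLatch(1);
        List<Thread> started = new ArrayList<Thread>();
        try {
            for (int i = 0; i < wanted; i++) {
                Thread t = new Thread(null, new Runnable() {
                    public void run() {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                }, "spawn-" + i, stackBytes); // stackBytes is only a hint to the VM
                t.start(); // the point where a VM may throw OutOfMemoryError
                started.add(t);
            }
        } catch (OutOfMemoryError e) {
            // Stop spawning and report how far we got.
        } finally {
            release.countDown(); // let the parked threads finish
            for (Thread t : started) {
                try {
                    t.join();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }
        return started.size();
    }

    public static void main(String[] args) {
        System.out.println("started " + spawn(100, 64 * 1024) + " threads");
    }
}
```

Raising the first argument until the return value falls short of it gives a
rough measure of how many threads a given VM can hold at a given stack size.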

I think what the above implies is that it is OK for the JVM to report an
error and exit if the app tries to create too many threads.  If this is the
case, it looks like we need to clean up the handling of malloc() errors so
that the JVM can exit gracefully.

Another approach would be to throw something like a
"TooManyThreadsAtOnceException" and keep running the app.  I can't find
any existing exception of this kind.  It's probably not an option.

Another approach would be to make the Thread.start() method wait until there
are enough resources to create a new thread.  Most likely the app would hang
mysteriously without warning.  This is probably not an option either.
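
For what it is worth, the waiting approach can be sketched in plain Java with
a semaphore.  This is only an illustration under assumptions: BoundedStarter
and its methods are hypothetical names, the permit count stands in for
whatever resource check the VM would really do, and a real change would have
to live inside the VM's Thread.start() rather than in a wrapper.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper for illustration; not an existing Harmony class.
class BoundedStarter {
    private final Semaphore slots;

    BoundedStarter(int maxLive) {
        slots = new Semaphore(maxLive);
    }

    /** Blocks until fewer than maxLive started threads are running, then starts one. */
    Thread start(final Runnable task) {
        slots.acquireUninterruptibly(); // callers hang here if no thread ever exits
        Thread t = new Thread(new Runnable() {
            public void run() {
                try {
                    task.run();
                } finally {
                    slots.release(); // free the slot when the task ends
                }
            }
        });
        try {
            t.start();
        } catch (Error e) {
            slots.release(); // the thread never ran, so give the slot back
            throw e;
        }
        return t;
    }

    int freeSlots() {
        return slots.availablePermits();
    }

    /** Runs `tasks` trivial jobs with at most `maxLive` alive; returns how many completed. */
    static int runAll(int maxLive, int tasks) {
        BoundedStarter starter = new BoundedStarter(maxLive);
        final AtomicInteger done = new AtomicInteger();
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < tasks; i++) {
            threads.add(starter.start(new Runnable() {
                public void run() {
                    done.incrementAndGet();
                }
            }));
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return done.get();
    }
}
```

The acquireUninterruptibly() call is exactly where the mysterious hang would
come from: if the running threads are themselves waiting on work that the
blocked caller was supposed to start, nobody ever releases a slot.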

Another item we need to discuss is what are the Q1/Q2 goals for max number
of threads supported?  It seems we can do lots of useful stuff with a max of
1500 threads.  The useful stuff being items like the bringup of enterprise
apps, fixing stability problems...
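
As a back-of-envelope check on what a thread-count goal costs, the sketch
below divides a 2Gb address space evenly among the threads.  It assumes the
whole address space goes to thread stacks, which overstates the real budget
since heap and code need room too, so treat the results as upper bounds.
StackBudget is a made-up name; the 1500 and 6000 figures are the thread
counts Gregory reported for DRLVM and the Sun VM.

```java
// Hypothetical helper for illustration only.
class StackBudget {

    /** Upper bound on address space per thread if everything went to stacks. */
    static long perThreadBytes(long addressSpaceBytes, int threads) {
        return addressSpaceBytes / threads;
    }

    public static void main(String[] args) {
        long twoGb = 2L * 1024 * 1024 * 1024; // 2Gb user address space
        // Thread counts taken from the MegaSpawn observations in this thread.
        System.out.println("1500 threads: " + perThreadBytes(twoGb, 1500) / 1024 + "K each");
        System.out.println("6000 threads: " + perThreadBytes(twoGb, 6000) / 1024 + "K each");
    }
}
```

Under that assumption, 1500 threads leaves at most about 1.4Mb of address
space per thread, while 6000 threads leaves at most about 350K per thread.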


> I tried to watch Sun implementation and it looks like they map smaller
> amounts of memory for thread stacks. Maybe they map only initial stack
> memory somehow and allow it to grow later (although I don't quite
> understand how it is possible in continuous address space). When Sun VM
> executes this test it created up to ~6k simultaneously running threads
> and process size at the same moment was smaller than 2Gb.
>
> I think the same problem may happen on Linux because it spills out OOMEs
> on Ubuntu as well.
>
> If somehow test doesn't crash on failed mallocs and gets to the shutdown
> stage and hangs with 2 or more dead locked threads. So far I didn't
> quite understand how they lock each other.
>
> --
> Gregory
>
>


-- 
Weldon Washburn
Intel Enterprise Solutions Software Division
