On Jan 9, 2007, at 10:51 AM, Weldon Washburn wrote:
> On 1/9/07, Gregory Shimansky <gshimansky@gmail.com> wrote:
>>
>> Geir Magnusson Jr. wrote:
>> > I started a new thread because I think this is really important.
>> >
>> > I've also added a page in the wiki to track this stuff, because
>> I can't
>> > keep it in my head:
>> >
>> > http://wiki.apache.org/harmony/MegaSpawnThreadingBug
>> >
>> > which you can get to from the home page via the "WhiteBoards"
>> section,
>> > intended to be a place where we can work as a team on a
>> whiteboard, with
>> > the intention that once the mini-project is over, we erase...
>
>
> This is a good idea. I still want to put some of the discussion on email
> so that we have a permanent record of our investigations. I have some
> thoughts inlined below.
>
>>
>> > I think this is a scary scary problem :)
>>
>> I've tried to analyze the MegaSpawn test on Windows and here's what I
>> found out.
>>
>> OOME is thrown because the process virtual size easily gets up to 2 GB.
>> This happens at about 1.5k simultaneously running threads. I think it
>> happens because all of the process's virtual memory is mapped for thread
>> stacks.
>>
>> When virtual memory is exhausted, all kinds of problems may occur. In
>> many places there are assertions that malloc returns non-NULL, and these
>> assertions fail. In some places there are no checks for malloc, and a
>> NULL pointer is dereferenced, which also crashes the VM.
>
>
> Good job! I got the same sort of hunch when I looked at the source code
> but did not have enough time to pin down specifics. The only guidance I
> found regarding what happens when too many threads are spawned is the
> following in the java.lang.Thread reference manual, "...specifying a lower
> [stacksize] value may allow a greater number of threads to exist
> concurrently without throwing an OutOfMemoryError (or other internal
> error)."
>
> I think what the above implies is that it is OK for the JVM to
> error and
> exit if the app tries to create too many threads. If this is the
> case, it
> sort of looks like we need to clean up the handling of malloc()
> errors so
> that the JVM can exit gracefully.
Well - I think that we should strive to maintain an internally
consistent VM, throw an OOM, and let the app decide. There are
situations where with a solid VM, you can deal w/ the OOM at the app
level.
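
As a rough illustration of what "deal w/ the OOM at the app level" could
look like, here is a minimal sketch. It assumes the VM stays consistent and
throws OutOfMemoryError from Thread.start() when it cannot create another
thread. The class name SpawnUntilOom and the spawn() helper are hypothetical
and not part of the MegaSpawn test; the stack size argument is only a hint
that a VM is free to ignore.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch: start threads until a cap is reached or the VM
// reports OutOfMemoryError, then release them and report how many ran.
final class SpawnUntilOom {

    /** Starts up to maxThreads parked threads; returns how many were started. */
    static int spawn(int maxThreads, long stackSizeBytes) {
        CountDownLatch release = new CountDownLatch(1);
        List<Thread> started = new ArrayList<>();
        try {
            for (int i = 0; i < maxThreads; i++) {
                // The stackSize argument is a hint; a smaller value may let
                // more threads exist concurrently on some VMs.
                Thread t = new Thread(null, () -> {
                    try {
                        release.await();            // park until released
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }, "spawn-" + i, stackSizeBytes);
                t.setDaemon(true);
                t.start();                          // may throw OutOfMemoryError
                started.add(t);
            }
        } catch (OutOfMemoryError e) {
            // App-level decision: stop spawning and carry on with what we have.
            System.err.println("Stopped after " + started.size() + " threads: " + e);
        } finally {
            release.countDown();                    // let every parked thread exit
        }
        for (Thread t : started) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return started.size();
    }

    public static void main(String[] args) {
        System.out.println("Started " + spawn(100, 64 * 1024) + " threads");
    }
}
```

This only works if the VM itself survives the failed allocation, which is
the point above: the app can only make this decision if the VM is still
standing when the OOM is thrown.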
>
> Another approach would be to throw something like a,
> "TooManyThreadsAtOnceException" and keep running the app. I can't
> find
> anything like this kind of exception. It's probably not an option.
No :)
>
> Another approach would be to make Thread.start() method wait until
> there are
> enough resources to create a new thread. Most likely the app would
> hang
> mysteriously without warning. This is probably not an option either.
Nope :)
>
> Another item we need to discuss is what are the Q1/Q2 goals for max
> number
> of threads supported? It seems we can do lots of useful stuff with
> a max of
> 1500 threads. The useful stuff being items like the bringup of
> enterprise
> apps, fixing stability problems...
I don't mind a suboptimal # of concurrent threads - we can work on
that over time. The fact that the VM falls over dead scares the
bejeezus out of me.
geir
>
>
>> I tried to watch the Sun implementation and it looks like they map smaller
>> amounts of memory for thread stacks. Maybe they map only the initial stack
>> memory somehow and allow it to grow later (although I don't quite
>> understand how that is possible in a contiguous address space). When the
>> Sun VM executed this test it created up to ~6k simultaneously running
>> threads, and the process size at that moment was smaller than 2 GB.
>>
>> I think the same problem may happen on Linux because it spills out
>> OOMEs
>> on Ubuntu as well.
>>
>> If the test somehow doesn't crash on failed mallocs and gets to the
>> shutdown stage, it hangs with 2 or more deadlocked threads. So far I
>> haven't quite understood how they lock each other.
>>
>> --
>> Gregory
>>
>>
>
>
> --
> Weldon Washburn
> Intel Enterprise Solutions Software Division