harmony-dev mailing list archives

From "Aleksey Ignatenko" <aleksey.ignate...@gmail.com>
Subject Re: [drlvm] stress.Mix / MegaSpawn threading bug
Date Thu, 11 Jan 2007 08:12:53 GMT
On 1/10/07, Geir Magnusson Jr. <geir@pobox.com> wrote:

>
> On Jan 9, 2007, at 1:14 PM, Rana Dasgupta wrote:
>
> > On 1/9/07, Weldon Washburn <weldonwjw@gmail.com> wrote:
> >>
> >> On 1/9/07, Gregory Shimansky <gshimansky@gmail.com> wrote:
> >> >> I've tried to analyze the MegaSpawn test on Windows and here's
> >> >> what I found out.
> >> >>
> >> >> OOME is thrown because the process virtual size easily gets up
> >> >> to 2Gb. This happens at about 1.5k simultaneously running
> >> >> threads. I think it happens because all of the virtual process
> >> >> memory is mapped for thread stacks.
> >>
> >> >Good job!  I got the same sort of hunch when I looked at the
> >> >source code but did not have enough time to pin down specifics.
> >> >The only guidance I found regarding what happens when too many
> >> >threads are spawned is the following in the java.lang.Thread
> >> >reference manual: "...specifying a lower [stacksize] value may
> >> >allow a greater number of threads to exist concurrently without
> >> >throwing an OutOfMemoryError (or other internal error)."
> >>
> >> >I think what the above implies is that it is OK for the JVM to
> >> >error and exit if the app tries to create too many threads.  If
> >> >this is the case, it sort of looks like we need to clean up the
> >> >handling of malloc() errors so that the JVM can exit gracefully.
> >
> >
> > I am not sure that we need to do anything about this. The default
> > initial stack size on Windows is 1M,
>
> Yikes!  There's our problem on Windows...
>
> > and that is the recommended initial size for real applications.
> > The fact that our threads start with a larger initial stack mapped
> > (by default) than the RI is a design issue, not a bug. We could
> > start with 2K and create many more threads!
>
> That's right.  The fact that the VM crashes and burns is the bug, and
> a serious one, IMO.
>
> > Exactly as Gregory points out, ultimately we will hit virtual
> > memory limits and fail. The reason the RI seems to fail less is
> > that the test ends before running out of virtual memory. On my
> > 32-bit RHEL Linux box, the RI fails almost every time with
> > MegaSpawn, with an identical OOME error message and stack dump.
> >
> > We can catch the exception in the test and print a message. But I
> > am not very sure what purpose that would serve. A resource
> > exhaustion exception is a fatal exception and the process is hosed,
>
> No, it's not.
>
> > no real app would be able to do
> > anything more at this point.
>
> That's not true.
>
> > We should not use this test (which is not a real app) as guidance
> > to tune the initial stack size. My suggestion is to lower the test
> > duration so that we can create at least about 1000 (or whatever
> > magic number) threads. That is the stress condition we should test
> > for.
>
> The big thing for me is ensuring that we can drive the VM to the
> limit and that it maintains internal integrity, so applications that
> are designed to deal gracefully with resource exhaustion can do so
> w/ confidence that the VM isn't about to crumble out from underneath
> them.


"VM maintains internal integrity" on OOME situations is a good point. And I
have one very interesting idea how to monitor low memory state in runtime
and change VMs behaviour correspondingly:
the sample code below checks systems memory usage level. So at any moment (
e.g. adding new thread or commiting another portions of Java heap) one can
check that the system memory is almost exhausted. I suppose the number of
such places in drlvm where a lot memory is allocated at once is limited.
void exn_throw_if_exhausted() {
    if (port_vmem_usage_rate() > UPPER_MEMORY_BORDER) {
        ... throw_exception(OOME);
    }
}
This functionality will not guarantee that OOME handling works 100% of the
time, but by varying the UPPER_MEMORY_BORDER value we can get a low fail
rate for stress tests such as stress.Mix.
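
For illustration only, here is a rough sketch of how such a check might
guard an expensive allocation like a new thread stack. The name
reserve_thread_stack and the value of UPPER_MEMORY_BORDER are hypothetical,
not actual drlvm API; only port_vmem_usage_rate() comes from the patch
below.

#include <stddef.h>

#define UPPER_MEMORY_BORDER 90 /* percent; the value would need tuning */

extern size_t port_vmem_usage_rate(void); /* provided by the patch */

/* Returns 0 on success, -1 when the caller should throw OOME instead. */
static int reserve_thread_stack(size_t stack_size) {
    if (port_vmem_usage_rate() > UPPER_MEMORY_BORDER) {
        /* Fail early, while there is still enough memory left to
           construct and throw the OutOfMemoryError object safely. */
        return -1;
    }
    /* ... map stack_size bytes for the new thread's stack ... */
    return 0;
}

This way the VM would refuse the n-th thread with a clean OOME instead of
crashing later inside an unchecked malloc()/mmap() path.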

Index: vm/port/include/port_vmem.h
===================================================================
--- vm/port/include/port_vmem.h (revision 495134)
+++ vm/port/include/port_vmem.h (working copy)
@@ -94,7 +94,12 @@
 */
 APR_DECLARE(size_t *) port_vmem_page_sizes();

+/**
+* Returns the percentage (0-100) of system memory currently in use.
+*/
+APR_DECLARE(size_t) port_vmem_usage_rate();

+
 #ifdef __cplusplus
 }
 #endif
Index: vm/port/src/vmem/linux/port_vmem.c
===================================================================
--- vm/port/src/vmem/linux/port_vmem.c (revision 495134)
+++ vm/port/src/vmem/linux/port_vmem.c (working copy)
@@ -20,6 +20,7 @@
  */

 #include <sys/mman.h>
+#include <sys/sysinfo.h>
 #include <unistd.h>
 #include <errno.h>
 #include <malloc.h>
@@ -131,6 +132,15 @@
  return page_sizes;
 }

+APR_DECLARE(size_t) port_vmem_usage_rate() {
+    struct sysinfo info;
+    if (sysinfo(&info) != 0)
+        return 0; /* query failed: assume no memory pressure */
+    /* 64-bit math: totalram * 100 can overflow unsigned long on 32-bit */
+    return (size_t)((unsigned long long)(info.totalram - info.freeram)
+                    * 100 / info.totalram);
+}
+
 #ifdef __cplusplus
 }
 #endif
Index: vm/port/src/vmem/win/port_vmem.c
===================================================================
--- vm/port/src/vmem/win/port_vmem.c (revision 495134)
+++ vm/port/src/vmem/win/port_vmem.c (working copy)
@@ -215,6 +215,15 @@
  return page_sizes;
 }

+APR_DECLARE(size_t) port_vmem_usage_rate() {
+    MEMORYSTATUSEX ms;
+    ms.dwLength = sizeof(ms);
+    if (!GlobalMemoryStatusEx(&ms))
+        return 0; /* query failed: assume no memory pressure */
+    /* dwMemoryLoad is already the 0-100 physical memory load */
+    return (size_t)ms.dwMemoryLoad;
+}
+
 #ifdef __cplusplus
 }
 #endif
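
If it helps review, a minimal standalone check of the new function could
look like this (a hypothetical test harness, assuming the patched port
library is linked in):

#include <stdio.h>
#include <stddef.h>

extern size_t port_vmem_usage_rate(void);

int main(void) {
    /* Should roughly track the OS figure, e.g. Task Manager's memory
       load on Windows or free(1) output on Linux. */
    printf("system memory usage: %u%%\n",
           (unsigned)port_vmem_usage_rate());
    return 0;
}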

 What do you think about the idea?

Best regards,
Aleksey.



> geir
>
>
> > Thanks,
> > Rana
>
>
