river-dev mailing list archives

From Mark Brouwer <mark.brou...@cheiron.org>
Subject [BUG] NPE in com.sun.jini.jeri.internal.runtime.SelectionManager$SelectLoop
Date Thu, 03 May 2007 23:27:31 GMT
Recently I encountered an NPE similar to one I reported more than 2
years ago

and that has been fixed in the meantime, attached you will find the

The tests were done on a network with 5 dual 2.7G Xeon servers
and all clients and services were running on Seven with NIO enabled
for the Tcp(Server)Endpoints and all services shared the same

One server was hosting a JavaSpace while the other servers represented
clients. In total 12 clients were each trying, from 100 threads, to write
a ~15 KByte Entry into the space, and at the same time each service
tried to obtain these entries from another 100 threads. All bandwidth
was consumed and it was just a matter of time before the JavaSpace
crashed; OutOfMemoryErrors were showing up in the log in the muxer for
the container hosting the JavaSpace just before the JVM passed away.

No problem, one would think, although I also noticed that 2 out of 3 JVMs
in which 8 out of 12 'clients' were running seemed to be dead as well,
even though their JVMs were still there and the discovery protocols were
doing fine, as was the internal JAR file server.

Looking into the log files for these 2 containers I noticed each request
resulted in the exceptions that can be seen in the attachment. The log
files were rather large and I couldn't find another type of cause that
quickly, and due to a stupid action on my part I forgot to set them
aside in a place that wasn't going to be wiped out.

Some environment data:

Linux x.y.z 2.6.9-42.ELsmp #1 SMP Sat Aug 12 09:39:11 CDT 2006 i686 i686
i386 GNU/Linux

java version "1.5.0_11"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_11-b03)
Java HotSpot(TM) Server VM (build 1.5.0_11-b03, mixed mode)

I analyzed the code and I'm flabbergasted by this NPE (for the record, I
made no modifications to the code), and so far I can only come to the
conclusion that either the memory model of the Sun JVM 1.5.0_11 for
Linux is broken or the fix done for Porter was not 100% perfect, although
I haven't seen the NPE for a long time.

I don't have the log files for close inspection, but given the fact that
2 out of 3 JVMs had the same behavior and showed the same stacktraces,
there must have been something strange going on beyond a randomly
flipped bit. Unfortunately I no longer have access to the environment to
take some time to investigate further.

Might this also indicate we should give 
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4857137 some priority 
... Also, I have no clue how many people enable NIO for Jini ERI. I've
done it since the beginning and have been bitten a few times by it. But
maybe we should enable it by default so more people can do the proper
field testing. If we don't make it the default it will always stay
suspicious to some extent.
