incubator-river-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Dolan (JIRA)" <j...@apache.org>
Subject [jira] Created: (RIVER-337) Attempted discard of unknown registrar kills LookupLocatorDiscovery thread
Date Tue, 20 Apr 2010 16:04:50 GMT
Attempted discard of unknown registrar kills LookupLocatorDiscovery thread
--------------------------------------------------------------------------

                 Key: RIVER-337
                 URL: https://issues.apache.org/jira/browse/RIVER-337
             Project: River
          Issue Type: Bug
          Components: net_jini_discovery, net_jini_lookup
    Affects Versions: AR1, jtsk_2.1
            Reporter: Chris Dolan


The method
   net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(DiscoveryEvent)
has the following code that throws a RuntimeException (the code comment suggests that it is
supposed to be impossible, but it's not).

        ProxyReg reg = findReg(proxys[i]);
        if(reg != null ) { // this check can be removed.
            proxyRegSet.remove(proxyRegSet.indexOf(reg));
            drops.add(reg);
        } else {
            throw new RuntimeException("discard error");
        }//endif

Our QA does failover testing with two servers, each with a Reggie, where we deliberately crash
and reboot server 1 then server 2 every 30 minutes continuously.  In one case, we hit that
RuntimeException.  I don't know why we got a null reg (that's a problem for another defect,
maybe an undiagnosed race of two discards put on a task queue?  Maybe related to RIVER-37?).
 But it caused a catastrophic chain of events because the RuntimeException is not caught anywhere
up the stack.  In our case, it killed the LookupLocatorDiscovery$Notifier thread.

java.lang.RuntimeException: discard error
	at net.jini.lookup.ServiceDiscoveryManager$DiscMgrListener.discarded(2639)
	at net.jini.discovery.LookupDiscoveryManager.notifyListener(1375)
	at net.jini.discovery.LookupDiscoveryManager.notifyListener(1356)
	at net.jini.discovery.LookupDiscoveryManager.access$500(92)
	at net.jini.discovery.LookupDiscoveryManager$LocatorDiscoveryListener.discarded(543)
	at net.jini.discovery.LookupLocatorDiscovery$Notifier.run(650)

I propose three changes:

  1) change the discarded() method above to simply warn instead of throwing
  2) put a try/catch(Throwable) around the listener invocation in 
     LookupLocatorDiscovery$Notifier.run()
  3) put a similar try/catch around listener invocation in LookupDiscoveryManager.notifyListener

The idea behind #2 and #3 is that misbehaving listeners should not be allowed to derail the
discovery process.


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message