river-dev mailing list archives

From Gregg Wonderly <gr...@wonderly.org>
Subject Re: JavaSpace.notify() "not reliable"
Date Thu, 17 May 2007 18:24:27 GMT
Timur Mehrvarz wrote:
> This is why it is a bit hard for me to understand, how notify() can be 
> so unreliable. I'd say, I am having one dropped notify() call for every 
> 50 objects written to space. Roughly.

From some perspectives, one should be comfortable that on a single machine 
there shouldn't be any problems.  However, software is "everywhere", and 
software can have "bugs".  The question is whether you can have faith in the 
software, or have the time to find every "real" bug.  After working for AT&T 
and seeing the amount of code that existed just to deal with auditing and 
validating the software runtime environment, I changed my whole view of 
software[1].

> I still need to implement your third suggestion (below). I can do so 
> next week. And I will report back, if this fully fixes the issue for me.
>> With Javaspaces, you should typically request notification, and then do a read
>> to see if there is an entry that matches the same template, and then take that
>> entry to process it if so.
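
The quoted suggestion can be sketched as follows.  This is only an 
illustration under stated assumptions: TinySpace and LossySpace are 
hypothetical in-memory stand-ins, not the real net.jini.space.JavaSpace 
interface, and only the method names readIfExists and takeIfExists are 
borrowed from it.  The stand-in drops every 50th event on purpose, and the 
final sweep is an extra assumption of this sketch rather than part of the 
original suggestion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical stand-in for a space; NOT the real JavaSpace API. */
interface TinySpace {
    void write(String entry);
    String readIfExists();   // peek at a matching entry, or null
    String takeIfExists();   // remove a matching entry, or null
    void notifyOnWrite(Runnable listener);
}

/** In-memory space that deliberately loses every 50th notification. */
class LossySpace implements TinySpace {
    private static final int DROP_EVERY = 50; // assumption, mirrors "1 in 50"
    private final ConcurrentLinkedQueue<String> entries = new ConcurrentLinkedQueue<>();
    private final List<Runnable> listeners = new ArrayList<>();
    private int writes = 0;

    public void write(String entry) {
        entries.add(entry);
        writes++;
        if (writes % DROP_EVERY == 0) {
            return; // simulate a lost event: the entry is stored, nobody is told
        }
        for (Runnable l : listeners) {
            l.run();
        }
    }

    public String readIfExists() { return entries.peek(); }
    public String takeIfExists() { return entries.poll(); }
    public void notifyOnWrite(Runnable listener) { listeners.add(listener); }
}

class NotifyThenDrain {
    /** Treat an event as a hint only: read, then take, until nothing matches. */
    static void drain(TinySpace space, List<String> processed) {
        while (space.readIfExists() != null) {
            String e = space.takeIfExists();
            if (e == null) {
                break; // another consumer took it between the read and the take
            }
            processed.add(e);
        }
    }

    static List<String> run(int count) {
        TinySpace space = new LossySpace();
        List<String> processed = new ArrayList<>();
        // 1. Request notification first.
        space.notifyOnWrite(() -> drain(space, processed));
        // 2. Then check for entries that were already there before registering.
        drain(space, processed);
        for (int i = 0; i < count; i++) {
            space.write("entry-" + i);
        }
        // 3. Final sweep (assumption): catches an entry whose event was lost
        //    and that no later write happened to flush out.
        drain(space, processed);
        return processed;
    }
}
```

With run(100), the stand-in drops the events for the 50th and 100th writes.  
The 51st write drains the backlog left by the 50th, and only the final sweep 
recovers the last entry, which is why a real client would probably want some 
periodic check in addition to the listener.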

Missing 1 in 50 or so sounds like a potential concurrency-related bug.  If you 
increase the flow of data, does that proportionally increase the missing 
notifies?  If you greatly reduce the flow, does that make the notifies more 
reliable?

Gregg Wonderly

[1] I don't know how many people really know how extreme the measures are that 
make sure that telephone switching "keeps on switching".  Audits of audits are 
not uncommon.  In the end, if you want your software to work reliably, you have 
to test/validate every assumption/function you rely on.

I try to design audits and validations so that they provide real value, beyond 
recovering from bugs silently.  I always start out with warning-level logging in 
legs of code that I don't think I should enter.  If I see those legs being 
entered, I start adding the collection of stack traces for the places where I 
think those constraints are controlled.  I go through all the effort to try to 
find out more about what is causing the behavior.  Test suites can find a lot of 
these kinds of things early, but in general, there's always going to be more to 
find once you are in the field.
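
A minimal sketch of that two-stage approach, using java.util.logging.  The 
AuditedConnection class, its fields, and the "send after close" constraint are 
all hypothetical, chosen only to show the shape: a warning in the leg that 
should never run, plus a stack trace captured at the place where the 
constraint is controlled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example class; the names are illustrative only. */
class AuditedConnection {
    private static final Logger LOG =
        Logger.getLogger(AuditedConnection.class.getName());

    private boolean open = true;
    private Throwable closedAt;   // stage 2: remembers who changed the state
    int unexpectedSends = 0;      // simple counter so the audit is observable

    void close() {
        open = false;
        // Capture a stack trace where the constraint is controlled, so a
        // later violation can report who closed the connection.
        closedAt = new Throwable("close() was called from here");
    }

    boolean send(String msg) {
        if (!open) {
            // Stage 1: a leg we do not expect to enter, logged at WARNING
            // instead of being recovered from silently.
            unexpectedSends++;
            LOG.log(Level.WARNING,
                    "send(\"" + msg + "\") on a closed connection; closed at:",
                    closedAt);
            return false;
        }
        return true; // normal path; the real work would happen here
    }
}
```

Calling send() after close() returns false, bumps the counter, and logs the 
warning together with the trace of the close() call, which is usually the 
more interesting of the two call sites.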
