river-dev mailing list archives

From Greg Trasuk <tras...@stratuscom.com>
Subject Re: Bug fixing
Date Tue, 26 Oct 2010 13:34:23 GMT
Hi Patricia:

See comments interspersed...

On Tue, 2010-10-26 at 00:54, Patricia Shanahan wrote:
> On 10/25/2010 10:50 AM, Greg Trasuk wrote:
> >
> > On Mon, 2010-10-25 at 12:15, Patricia Shanahan wrote:
> >> I've made some progress on one form of failure, a resource remaining
> >> available after its lease was scheduled to expire.
> >>
> >> If the test pauses, e.g. due to an inserted Thread.sleep, between
> >> creating the lease and probing it for the first time, the test passes.
> >> The effect of such a pause is to delay the probe until after
> >> the lease has expired. Incidentally, hooking up to the Eclipse debugger
> >> was very helpful in discovering this behavior. The first indication was
> >> when I spent a few minutes investigating at a breakpoint, let it
> >> continue, and the test passed.
> >>
> >> If the test gets to the probe point before the lease is due to expire,
> >> it goes into a polling loop. No matter how long it stays in the polling
> >> loop, the lease does not expire. Somehow, the probe loop has the effect
> >> of preventing expiration.
> >>
> > What is the "probe code" ?  How exactly does it determine that the lease
> > is still alive?
> 
> It calls this method:
> 
>      protected boolean isAvailable() throws TestException {
>          try {
>              final Lease x = space.write(aEntry, resource, 1000);
>              // addOutriggerLease(x, true);
>          } catch (TransactionException e) {
>              // This is ok...probably means it is a bad transaction
>              return false;
>          } catch (Exception e) {
>              throw new TestException("Testing for availability", e);
>          }
>          return true;
>      }
> 
> (I've slightly simplified the code, as part of a general attempt to 
> comment out anything that is not required to reproduce the problem.)
> 
> space is a JavaSpace. The objective seems to be to detect whether the 
> Transaction, resource, remains available. It was originally requested 
> with a one minute lease duration, and it seems to have got it, "aprox 
> duration:59991"
> 

OK, so the JavaSpace will attempt to join the transaction and that join
should fail if the transaction's lease has expired, or if the
transaction has been aborted, or is in the commit process (i.e. not in
ACTIVE state).
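
To make those join conditions concrete, here is a rough sketch. It is a toy model and not the Mahalo code: the class names, the State values, and the explicit "now" parameter are all made up for illustration.

```java
public class JoinSketch {
    // Hypothetical states; the real transaction manager defines its own constants.
    enum State { ACTIVE, VOTING, COMMITTED, ABORTED }

    // Stand-in for CannotJoinException so the sketch compiles without the Jini jars.
    static class CannotJoin extends Exception {
        CannotJoin(String msg) { super(msg); }
    }

    static class Txn {
        private final long expiresAtMillis;
        private State state = State.ACTIVE;

        Txn(long expiresAtMillis) { this.expiresAtMillis = expiresAtMillis; }

        void setState(State s) { state = s; }

        // A join is refused once the lease has run out or the transaction has left ACTIVE.
        void join(long nowMillis) throws CannotJoin {
            if (nowMillis >= expiresAtMillis) {
                throw new CannotJoin("lease expired");
            }
            if (state != State.ACTIVE) {
                throw new CannotJoin("state is " + state);
            }
        }
    }

    // Mirrors the shape of the test's isAvailable(): true only if the join succeeds.
    static boolean canJoin(Txn t, long nowMillis) {
        try {
            t.join(nowMillis);
            return true;
        } catch (CannotJoin e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Txn t = new Txn(60_000);                    // one-minute lease, measured from time 0
        System.out.println(canJoin(t, 1_000));      // true: still inside the lease
        System.out.println(canJoin(t, 61_000));     // false: the lease has run out
    }
}
```

In this model an early successful join has no effect on the expiry time, so a later join after the one-minute mark is refused no matter what happened before it.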

> If the isAvailable method is called even once before the scheduled 
> expiration of the resource lease, it will go on returning true. Repeated 
> calls are not needed - a single early call is sufficient. I've tested 
> that through a half hour sleep between two calls.
> 
> If the first call to this method is after the scheduled expiration of 
> the resource lease, it returns false.
> 
> The JavaSpace is implemented by OutriggerServerImpl. This print call in 
> its public long[] write(EntryRep rep, Transaction tr, long lease) method:
> 
> System.err.printf("XXXpats: Write of rep %s, tr %s, lease %d at %d%n",
>         rep, tr, lease, System.currentTimeMillis());
> 
> reports:
> 
> XXXpats: Write of rep 
> EntryRep[com.sun.jini.test.share.UninterestingEntry], tr 
> net.jini.core.transaction.server.ServerTransaction 
> [manager=com.sun.jini.mahalo.TxnMgrProxy$ConstrainableTxnMgrProxy@2b6ad6ff, 
> id=-6728796543664280839], lease 1000 at 1288065446700
> 
> 
> 
> >
> >> I would like to understand the rules for lease expiration. What actions
> >> should extend a lease?
> >
> > lease.renew(long duration);
> >
> >
> >> Is the lease manager required to expire it on
> >> schedule? (Even if it is not required to expire the lease I still need
> >> to find out whether the non-expiration is deliberate design or due to a
> >> bug.)
> >
> > Do you mean the landlord (i.e. the grantor of the lease), or the
> > LeaseRenewalManager object (which automatically renews leases on behalf
> > of a lessee)?
> >
> > The owner of the resource does not need to expire it on a time schedule;
> > it could expire the lease at some future point that is a convenient run
> > time (like the next time the service happens to be accessed).
> > Conceptually, the lease is entirely for the convenience of the lessor;
> > expiry means that the client is no longer interested in the resource, so
> > the client's resource allocation can be freed.
> >
> > In any case, specs would be at:
> >
> > http://www.jini.org/wiki/Jini_Distributed_Leasing_Specification
> 
> The bottom line seems to me to be that the observed behavior may be 
> permitted, but is weird enough that I need to investigate it to find out 
> if it is an indication of a bug.
> 

The transaction spec says (under "Joining a Transaction") that "The join
method throws CannotJoinException if the transaction is known to the
manager but is no longer active."

So it would appear that this behaviour is a bug, if the transaction's
lease has actually expired.

I had a quick look through TxnManagerImpl and I don't see any
"auto-renew-on-join" behaviour, so the result you're seeing is odd.  In
fact, the join(...) method finds a TxnManagerTransaction instance, then
calls join(...) on it, which appears to check whether the transaction
has expired and should throw CannotJoinException if it has.

I'm not currently set up to run the tests, but if I were, I'd set the log
level on TxnManagerImpl to FINER and see if somebody is renewing the
lease on that transaction, then try to track down who it is.
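
For what it's worth, raising the level programmatically with java.util.logging might look something like the sketch below. The logger name is my assumption, so check the Mahalo docs for the names it actually logs under. The handler needs its level raised too, otherwise the FINER records get filtered out before they reach the console.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MahaloLogSketch {
    // Assumed logger name, not verified against the Mahalo sources.
    static final String ASSUMED_LOGGER = "com.sun.jini.mahalo.transactions";

    // Raises both the logger and a dedicated console handler to FINER.
    static Logger enableFiner(String name) {
        Logger logger = Logger.getLogger(name);
        logger.setLevel(Level.FINER);
        ConsoleHandler handler = new ConsoleHandler();
        handler.setLevel(Level.FINER);   // a ConsoleHandler defaults to INFO
        logger.addHandler(handler);
        return logger;                   // caller keeps a strong reference
    }

    public static void main(String[] args) {
        Logger logger = enableFiner(ASSUMED_LOGGER);
        logger.finer("FINER output is now enabled for " + logger.getName());
    }
}
```

This only helps if it runs inside the JVM that hosts the transaction manager. For the test harness, the equivalent settings in a logging.properties file are probably the more practical route.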


Cheers,

Greg.


> A single write call using the transaction protects it for at least half 
> an hour. That is just the longest time I've tested - it may protect it 
> indefinitely, which would be a bug. There are no leases involved longer 
> than one minute.
> 
> Patricia
-- 
Greg Trasuk, President
StratusCom Manufacturing Systems Inc. - We use information technology to
solve business problems on your plant floor.
http://stratuscom.com

