incubator-river-dev mailing list archives

From Robert Resendes <Robert.Resen...@Sun.COM>
Subject Re: Mahalo endless loop on prepareAndCommit
Date Mon, 19 May 2008 16:07:43 GMT
Guy Korland wrote:
> Hi,
>
> We found out that TransactionManager.commit(long timeout) might get stuck
> forever even if we set a timeout.
>
> It seems like the following code in TxnManagerTransaction is the source
> of it (line 680):
>
>     if ((job instanceof PrepareJob) ||
>             (job instanceof PrepareAndCommitJob)) {
>         try {
>             if (job.isCompleted(Long.MAX_VALUE)) {    // line 680
>                 result = (Integer) job.computeResult();
>
> Do you have any suggestions?
[Response carried over from 
http://archives.java.sun.com/cgi-bin/wa?A2=ind0805&L=JINI-USERS&T=0&F=&S=&P=3649]

Without more details, I can only guess as to what's happening. One 
possibility is that you have a "misbehaving" transaction participant 
that is taking a long time to respond to the "prepare" part of the 
two-phase commit protocol. If so, then here's what would happen:

- The client calls commit() with or without a timeout[1]
- The transaction manager creates a PrepareJob[2] that will "ask" each 
participant whether or not it can complete the transaction
- The transaction manager will wait until the PrepareJob provides an 
answer before continuing (i.e. abort or commit).

The PrepareJob attempts to call "prepare" on each participant. If any of 
those calls hangs, then the manager will hang as well (because it needs 
an answer from every participant before it can decide the outcome).
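
To make the difference between a bounded and an unbounded wait concrete, 
here is a minimal sketch. It is not Mahalo's code: the class 
PrepareWaitSketch, the method awaitVote, and the vote constants are all 
hypothetical, and a CompletableFuture merely stands in for a participant's 
answer to "prepare".

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PrepareWaitSketch {
    // Hypothetical vote values, not the constants used by the real manager.
    static final int PREPARED = 1;
    static final int ABORTED = 2;
    static final int TIMED_OUT = -1;

    /**
     * Waits up to waitMillis for a participant's vote.
     * Returns TIMED_OUT if no vote arrives in time, and ABORTED if the
     * participant failed or the waiting thread was interrupted.
     */
    static int awaitVote(CompletableFuture<Integer> vote, long waitMillis) {
        try {
            return vote.get(waitMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return TIMED_OUT;
        } catch (ExecutionException e) {
            return ABORTED;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return ABORTED;
        }
    }

    public static void main(String[] args) {
        // A well-behaved participant that has already voted.
        CompletableFuture<Integer> responsive =
                CompletableFuture.completedFuture(PREPARED);
        // A "misbehaving" participant that never answers.
        CompletableFuture<Integer> hanging = new CompletableFuture<>();

        System.out.println(awaitVote(responsive, 100)); // prints 1
        System.out.println(awaitVote(hanging, 100));    // prints -1

        // awaitVote(hanging, Long.MAX_VALUE) would block for as long as the
        // participant stays silent, which is the hang described above.
    }
}
```

Whether a manager may give up on a silent participant like this, and what 
it should do afterwards, is a design question that the sketch does not 
answer; it only shows why a wait of Long.MAX_VALUE never returns on its own.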

Hopefully, this helps, but if it doesn't, then please send more details 
regarding what you are doing and seeing.

You might also consider sending any follow-up messages to the River User 
List (see http://incubator.apache.org/river/RIVER/mailing-lists.html) 
because all the new development effort has shifted (from Sun) to that 
project under the Apache organization.


Notes:

[1] Note that the timeout parameter only states how long the client is 
willing to wait for the manager to notify the participants of the 
transaction's outcome (i.e. abort/commit). The timeout only applies 
after the state of the transaction is determined. See 
http://java.sun.com/products/jini/2.0/doc/specs/html/txn-spec.html#7224.

[2] There is also a PrepareAndCommitJob that optimizes for the single 
participant case, but the explanation is similar.
