jackrabbit-dev mailing list archives

From "Giota Karadimitriou" <Giota.Karadimitr...@eurodyn.com>
Subject RE: jackrabbit & clustering
Date Tue, 06 Jun 2006 18:45:49 GMT
Right, this is the point I was also trying to make: the serializable
state transferred between cluster nodes cannot carry listeners, so I
should just copy and notify.

My concerns now are read locks and connections. Read locks are difficult
to enforce because 1) they are acquired in many places in the code, and
making that many distributed calls to take locks on the other cluster
node(s) makes them expensive in terms of performance, and 2) if I try to
acquire a lock on a node that is not yet up, the system becomes somewhat
unstable afterwards.

My second concern is connections: the first end-to-end test of the
clustering scenario I have been discussing for so long ended in failure
with java.sql.SQLException: Timed out waiting for an available
connection, which means something I did is wrong and connections are not
being closed properly.
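That timeout symptom usually means a connection is checked out on some code path that returns or throws before close() runs. The usual guard is to release the connection in a finally block on every exit path. A minimal sketch of that pattern, using a stand-in class instead of a real java.sql.Connection (FakeConnection and useConnection are illustrative names, not Jackrabbit API):

```java
// Sketch: always release a pooled resource in finally, even when the
// operation throws. FakeConnection stands in for java.sql.Connection.
public class ConnectionGuard {
    static class FakeConnection {
        boolean closed = false;
        void close() { closed = true; }
        void query() { throw new RuntimeException("query failed"); }
    }

    // Returns the connection so the caller can verify it was closed.
    static FakeConnection useConnection() {
        FakeConnection con = new FakeConnection();
        try {
            con.query();      // may throw mid-operation
        } catch (RuntimeException e) {
            // swallow for the demo; the point is that close() still runs
        } finally {
            con.close();      // runs on every exit path
        }
        return con;
    }

    public static void main(String[] args) {
        System.out.println("closed=" + useConnection().closed); // closed=true
    }
}
```

Auditing each place a connection is obtained for a missing finally (or an early return between acquire and close) is usually how these pool-exhaustion failures are found.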

To summarize, what I do is the following:
acquire the distributed write lock,
share the states,
release the distributed write lock.

To do this:

1) I replaced every rwLock.writeLock().acquire() with an
acquireDistributedLock() method that acquires the write locks on our
2 cluster nodes, always in the same order.
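Acquiring the locks in the same global order on every node is what rules out the classic two-node deadlock (each node holding its local lock while waiting for the other's). The internals of acquireDistributedLock() are not shown above; a minimal sketch of just the ordering step, with illustrative node ids:

```java
import java.util.Arrays;

// Sketch: derive one total order over cluster nodes so every node
// acquires the distributed write locks in the same sequence.
public class OrderedLocking {
    static String[] lockOrder(String localNode, String remoteNode) {
        String[] order = { localNode, remoteNode };
        Arrays.sort(order); // same total order regardless of which node computes it
        return order;
    }

    public static void main(String[] args) {
        // Both nodes compute the same sequence, so neither can hold one
        // lock while waiting for the other in the reverse order.
        System.out.println(Arrays.toString(lockOrder("node2", "node1"))); // [node1, node2]
        System.out.println(Arrays.toString(lockOrder("node1", "node2"))); // [node1, node2]
    }
}
```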
2) Inside Update.end(), right after /* notify virtual providers about
node references */ and before releasing the write lock, I call a
shareStates() method:

public void end() throws ItemStateException {
    ...
    /* notify virtual providers about node references */
    for (int i = 0; i < virtualNodeReferences.length; i++) {
        List virtualRefs = virtualNodeReferences[i];
        if (virtualRefs != null) {
            for (Iterator iter = virtualRefs.iterator(); iter.hasNext();) {
                NodeReferences refs = (NodeReferences) iter.next();
                virtualProviders[i].setNodeReferences(refs);
            }
        }
    }

    // HERE: push the changed states to the other cluster node
    try {
        if (sismRemote == null) sismRemote = getRemoteSharedManager();
        if (sismRemote != null) sismRemote.updateStates(cloneStates());
    } catch (Exception e) {
        log.error(e);
    }
    // downgrade to read lock
    acquireReadLock();
    log.debug("before releasing write lock");
    releaseDistributedWriteLock();
    holdingWriteLock = false;
    ...
}

3) I release the distributed write lock with releaseDistributedWriteLock(), as visible above.
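cloneStates() is not shown above, but the shape implied by updateStates(List[] states) is a three-slot array holding the created, modified, and deleted states in that order. A hedged sketch of that packaging (the method and parameter names here are assumptions, not the actual implementation):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: pack the three change sets in the slot order that
// updateStates() expects: 0 = created, 1 = modified, 2 = deleted.
public class StateBatch {
    static List[] cloneStates(List created, List modified, List deleted) {
        List[] states = new List[3];
        states[0] = new ArrayList(created);
        states[1] = new ArrayList(modified);
        states[2] = new ArrayList(deleted);
        return states;
    }

    public static void main(String[] args) {
        List[] batch = cloneStates(new ArrayList(), new ArrayList(), new ArrayList());
        System.out.println(batch.length); // 3
    }
}
```

Since the whole array travels over RMI, every ItemState in it must serialize cleanly; as discussed below, the transient listener collections do not survive the trip.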

4) The updateStates() method (called on the SharedItemStateManager of
the other cluster node) is in effect the following. When a state is
cached it is evicted, so that it is retrieved again fresh from the
database; when it is virtual or transient, the incoming state is copied
over it and its listeners are notified of the change:

public void updateStates(List[] states) throws NoSuchItemStateException,
        ItemStateException {
    // created states
    Iterator addedIt = states[0].iterator();
    while (addedIt.hasNext()) {
        ItemState aState = (ItemState) addedIt.next();
        log.debug("remotely adding state: " + aState.getId());
        stateCreated(aState);
    }
    // modified states
    Iterator modifiedIt = states[1].iterator();
    while (modifiedIt.hasNext()) {
        ItemState remote = (ItemState) modifiedIt.next();
        ItemId iid = remote.getId();
        log.debug("remotely modifying state: " + iid);
        if (hasItemState(iid)) {
            if (hasNonVirtualItemState(iid)) {
                if (cache.isCached(iid)) {
                    // evict so the state is re-read from the database
                    log.debug("is cached: " + iid);
                    cache.evict(iid);
                }
            } else {
                // virtual or transient: copy and notify
                log.debug("virtual or transient: " + iid);
                ItemState transState = getItemState(iid);
                transState.copy(remote);
                transState.notifyStateUpdated();
            }
        }
    }
    // deleted states
    Iterator deletedIt = states[2].iterator();
    while (deletedIt.hasNext()) {
        ItemState remote = (ItemState) deletedIt.next();
        ItemId iid = remote.getId();
        log.debug("remotely deleting state: " + iid);
        if (hasItemState(iid)) {
            if (hasNonVirtualItemState(iid)) {
                if (cache.isCached(iid)) {
                    log.debug("is cached: " + iid);
                    // was stateDestroyed((ItemState) addedIt.next());
                    // addedIt is exhausted here; use the deleted state itself
                    stateDestroyed(remote);
                }
            } else {
                // virtual or transient
                log.debug("virtual or transient: " + iid);
                ItemState transState = getItemState(iid);
                transState.notifyStateDestroyed();
            }
        }
    }
}
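The branching above reduces to a small policy: a cached persistent state is evicted so it reloads from the database, a virtual or transient state gets copy-and-notify, and anything else is ignored. That decision, isolated as a stand-alone sketch (the sets here are stubs standing in for the manager and cache lookups, not Jackrabbit API):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch: the per-state decision made by updateStates() for a modified
// state arriving from the other cluster node.
public class RemoteUpdatePolicy {
    enum Action { EVICT, COPY_AND_NOTIFY, IGNORE }

    static Action actionFor(String id, Set existing, Set nonVirtual, Set cached) {
        if (!existing.contains(id)) return Action.IGNORE;        // hasItemState()
        if (nonVirtual.contains(id)) {                           // hasNonVirtualItemState()
            return cached.contains(id) ? Action.EVICT : Action.IGNORE;
        }
        return Action.COPY_AND_NOTIFY;                           // virtual or transient
    }

    public static void main(String[] args) {
        Set existing = new HashSet();
        existing.add("a");
        existing.add("b");
        Set nonVirtual = new HashSet();
        nonVirtual.add("a");
        Set cached = new HashSet();
        cached.add("a");
        System.out.println(actionFor("a", existing, nonVirtual, cached)); // EVICT
        System.out.println(actionFor("b", existing, nonVirtual, cached)); // COPY_AND_NOTIFY
    }
}
```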

This is all. Any feedback/comments on what I wrote (which is what I'm
also trying to do) would be greatly appreciated. Thanks

Giota

> -----Original Message-----
> From: Marcel Reutegger [mailto:marcel.reutegger@gmx.net]
> Sent: Tuesday, June 06, 2006 11:12 AM
> To: dev@jackrabbit.apache.org
> Subject: Re: jackrabbit & clustering
> 
> Hi Giota,
> 
> Giota Karadimitriou wrote:
> > Hello,
> >
> > I have finally put the scenario in action and so far I have
> > encountered the following problems.
> >
> > Regarding the actual scenario the problem I came across was in these
> > much discussed 2 lines of code :
> >
> > //modifiedIt comes from shism1
> > while (modifiedIt.hasNext()){
> >                 ItemState is1=(ItemState)modifiedIt.next();
> >                 ItemId iid=is1.getId();
> >                 log.debug("remotely modifying state:"+iid);
> >                 if (hasItemState(iid)){
> >                     log.debug("has item state:"+iid);
> >                     if (hasNonVirtualItemState(iid)){
> >                         log.debug("has non virtual item state:"+iid);
> >                         if (cache.isCached(iid)) {
> >                             log.debug("is cached:"+iid);
> >                             cache.evict(iid);
> >                         }
> >                     } else {
> >                         //virtual or transient
> >                         log.debug("virtual or transient:"+iid);
> >                         ItemState is2=getItemState(iid);
> >                         is1.connect(is2);                    //HERE
> >                         is1.push();                          //HERE
> >                         is2.notifyStateUpdated();            //NULL POINTER
> >                     }
> >
> > the problem is actually the following:
> > after connect, is1 becomes the listener for is2
> > and push() copies information from is1 to is2 but when
> > is2.notifyUpdated is invoked I get a null pointer exception because
> > is1 has no listeners. It is a copy (or even if I pass the actual
> > state and not a copy, it is a serializable state passed from cluster
> > to cluster without listeners attached to it any more) thus the null
> > pointer.
> >
> > Maybe I should just do
> >
> > is2.copy(is1);
> > is2.notifyUpdated();
> >
> > ?
> 
> Looking at the implementation of the push() method, which is basically
> a copy(), this should also work.
> 
> The reason for the NullPointerException is the missing listener
> collection in ItemState, which is declared as transient. That means, if
> you de-serialize an ItemState it kind of becomes invalid because of the
> missing listener collection. I'm not sure if this is a 'bug'. But since
> you are not interested in having listeners on that item anyway, you
> should be fine.
> 
> > I actually transfer the states themselves and not a copy because it
> > is difficult to make a copy of the state (no suitable constructor or
> > clone method etc). Besides, states are Serializable objects and can
> > be moved from cluster to cluster using RMI. You think this might
> > create a problem?
> 
> well, the basic problem you already encountered: what happens to
> listeners of an ItemState. Because listeners are not serializable, this
> information gets lost. Which I think is ok, but you just have to be
> aware of it...
> 
> > Finally, regarding locking, I implemented a write locking mechanism
> > following your suggestion to always follow the same order so as to
> > avoid deadlock situations, and it works fine so far. For read locks I
> > could not do the same because read locks are acquired even on
> > startup, and the problem is the following: the first cluster node is
> > e.g. initialized, then a distributed read lock tries to be enforced,
> > but the second cluster node is not yet up, so it fails with a
> > connection refused exception and the RAR cannot be deployed
> > successfully on both clusters.
> 
> I think a node should only acquire locks on other nodes that are
> actually part of the running cluster. If a node is not in the game you
> simply don't ask for locks on that node.
> 
> regards
>   marcel

