jackrabbit-dev mailing list archives

From "Dominique Pfister" <dominique.pfis...@day.com>
Subject Re: Server Hang in a Cluster, might be a deadlock
Date Tue, 15 May 2007 07:50:19 GMT
Another question that just crossed my mind: after you restart the
stalled cluster node, does it behave normally again?


On 5/15/07, Ian Boston <ieb@tfd.co.uk> wrote:
> Hi,
> I've been doing some testing of a 2-node Jackrabbit cluster using 1.3
> (with the JCR-915 patch), but I am getting some strange behavior.
> I use OSX Finder to mount a DAV service from each node and then upload
> lots of files to each DAV mount at the same time. All goes OK for the
> first few thousand files, and then one of the nodes stops responding to
> that session. The other node continues and finishes.
> Eventually OSX disconnects the stalled node.
> When I try the port of the apparently stalled cluster node, it still
> responds, but with some strange behaviour.
> A remount attempt responds with a 401 and forces basic login, but stalls
> after that point. (the URL is to the base of a workspace)
> If I open Firefox and access the DAV servlet, I can navigate
> down the directory tree, but if I try to refresh any JCR folder or JCR
> file that I have already visited (since the cluster node has been up),
> Firefox spins forever.
> I have put a deadlock detector class into both nodes (a Java class that
> looks for deadlocks through JMX), but it doesn't detect anything.
> I have also used JProfiler connected to one node, but it never detects a
> deadlock.
> I have tried all of this in single-node mode, with no Journal or
> ClusterNode, and have not been able to re-create the problem (yet).
> The one thing that I have seen in JProfiler is threads blocked waiting
> for an ItemState monitor inside Jackrabbit, but never for more than 500ms.
> I am using the standard DatabaseJournal and the
> SimpleDbPersistenceManager, however I see the same happening with the
> FileJournal.
> Any ideas? I might add some very simple debug output near the monitor
> that was blocking for 500ms.
> I did search JIRA but couldn't find anything that was a close match.
> Ian
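
For reference, the kind of JMX deadlock check mentioned in the quoted message could look roughly like the sketch below. It is not Ian's actual class: the name DeadlockCheck and its two helper methods are made up for illustration, and only the standard java.lang.management API is assumed. Note that this kind of check only reports cycles on Java monitors (and, where supported, ownable synchronizers), so a thread stalled on something outside the JVM, such as a database lock or a socket read, would probably not show up as a deadlock. For that reason the sketch also lists threads that are merely blocked, together with the lock they wait for and its owner.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of Jackrabbit.
class DeadlockCheck {

    // Returns the IDs of threads that the JVM reports as deadlocked,
    // or an empty array if none are found.
    static long[] findDeadlocked(ThreadMXBean threads) {
        long[] ids = threads.isSynchronizerUsageSupported()
                ? threads.findDeadlockedThreads()          // monitors + ownable synchronizers
                : threads.findMonitorDeadlockedThreads();  // object monitors only
        return ids == null ? new long[0] : ids;
    }

    // Lists every thread currently in state BLOCKED, with the lock it is
    // waiting for and the thread that owns that lock (if known).
    static String describeBlocked(ThreadMXBean threads) {
        StringBuilder sb = new StringBuilder();
        ThreadInfo[] infos =
                threads.getThreadInfo(threads.getAllThreadIds(), Integer.MAX_VALUE);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // thread ended between the two calls
            }
            if (info.getThreadState() == Thread.State.BLOCKED) {
                sb.append(info.getThreadName())
                  .append(" waits for ").append(info.getLockName())
                  .append(" held by ").append(info.getLockOwnerName())
                  .append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("Deadlocked threads: " + findDeadlocked(threads).length);
        System.out.print(describeBlocked(threads));
    }
}
```

Calling describeBlocked periodically on the stalled node and comparing the output over time might show whether the same thread stays blocked on the same ItemState monitor, which a one-off deadlock check would not reveal.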
