cocoon-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alexander Daniel (JIRA)" <>
Subject [jira] Commented: (COCOON-1985) AbstractCachingProcessingPipeline locking with IncludeTransformer may hang pipeline
Date Thu, 13 Mar 2008 15:44:24 GMT


Alexander Daniel commented on COCOON-1985:

Two requests can deadlock each other in Cocoon 2.1.11 (without use of parallel with include
* request A: generating lock for 55933
* request B: generating lock for 58840
* request B: waiting for lock 55933 which is hold by request A
* request A: waiting for lock 58840 which is hold by request B

I can reproduce this behaviour with Apache Bench and following pipeline:
* terminal 1: Apache Bench request A (ab -k -n 10000 -c 25 http://localhost:8888/samples/reproduceMultipleThreads/productOfferForDevice/55933/)
* terminal 2: Apache Bench request B (ab -k -n 10000 -c 25 http://localhost:8888/samples/reproduceMultipleThreads/productOfferForDevice/58840/)
* terminal 3: touching the two data files every second to invalidate the cache (while true;
do echo -n "."; touch 55933.xml 58840.xml; sleep 1; done)

* pipeline:
<map:pipeline type="caching">
  <map:match pattern="productOfferForDevice*/*/">
    <map:generate src="cocoon:/exists/{2}.xml" label="a"/>
    <map:transform type="xsltc" src="productOfferIncludeDevice.xsl" label="b">
        <map:parameter name="noInc" value="{1}"/>
    <map:transform type="include" label="c"/>
    <map:serialize type="xml"/>

  <map:match pattern="exists/**">
    <map:act type="resource-exists">
        <map:parameter name="url" value="{1}" />
        <map:generate src="{../1}" />
        <map:serialize type="xml" />
    <!-- not found -->
    <map:generate src="dummy.xml" />
    <map:serialize type="xml" />

After some seconds the deadlock occurs ==>
* Apache Bench requests run into a timeout

* I can see following pipe locks in the default transient store:
(class: org.mortbay.util.ThreadPool$PoolThread)
(class: org.mortbay.util.ThreadPool$PoolThread)
(class: org.mortbay.util.ThreadPool$PoolThread)
(class: org.mortbay.util.ThreadPool$PoolThread)

I added some logging to which reconfirms the explanations
INFO  (2008-03-13) 13:50.16:072 [sitemap] (/samples/reproduceMultipleThreads/productOfferForDevice/55933/)
PoolThread-47/AbstractCachingProcessingPipeline: generating lock PIPELOCK:PK_G-file-cocoon://samples/reproduceMultipleThreads/exists/55933.xml?pipelinehash=-910770960103935149_T-xsltc-file:///Users/alex/dev/cocoon/cocoon-2.1.11/build/webapp/samples/reproduceMultipleThreads/productOfferIncludeDevice.xsl;noInc=_T-include-I_S-xml-1
INFO  (2008-03-13) 13:50.16:074 [sitemap] (/samples/reproduceMultipleThreads/productOfferForDevice/55933/)
PoolThread-47/AbstractCachingProcessingPipeline: generating lock PIPELOCK:PK_G-file-file:///Users/alex/dev/cocoon/cocoon-2.1.11/build/webapp/samples/reproduceMultipleThreads/55933.xml
INFO  (2008-03-13) 13:50.16:075 [sitemap] (/samples/reproduceMultipleThreads/productOfferForDevice/58840/)
PoolThread-6/AbstractCachingProcessingPipeline: generating lock PIPELOCK:PK_G-file-cocoon://samples/reproduceMultipleThreads/exists/58840.xml?pipelinehash=-4996088883111986478_T-xsltc-file:///Users/alex/dev/cocoon/cocoon-2.1.11/build/webapp/samples/reproduceMultipleThreads/productOfferIncludeDevice.xsl;noInc=_T-include-I_S-xml-1
INFO  (2008-03-13) 13:50.16:075 [sitemap] (/samples/reproduceMultipleThreads/productOfferForDevice/58840/)
PoolThread-6/AbstractCachingProcessingPipeline: generating lock PIPELOCK:PK_G-file-file:///Users/alex/dev/cocoon/cocoon-2.1.11/build/webapp/samples/reproduceMultipleThreads/58840.xml
INFO  (2008-03-13) 13:50.16:281 [sitemap] (/samples/reproduceMultipleThreads/productOfferForDevice/58840/)
PoolThread-6/AbstractCachingProcessingPipeline: waiting for lock PIPELOCK:PK_G-file-file:///Users/alex/dev/cocoon/cocoon-2.1.11/build/webapp/samples/reproduceMultipleThreads/55933.xml
INFO  (2008-03-13) 13:50.16:304 [sitemap] (/samples/reproduceMultipleThreads/productOfferForDevice/55933/)
PoolThread-47/AbstractCachingProcessingPipeline: waiting for lock PIPELOCK:PK_G-file-file:///Users/alex/dev/cocoon/cocoon-2.1.11/build/webapp/samples/reproduceMultipleThreads/58840.xml

With the attached reproduceMultipleThreads.tar.gz you can reproduce the behaviour yourself:
* download and extract Cocoon 2.1.11
* cd $CocoonHome
* ./
* cd build/webapp/samples
* tar -xzf $DownloadFolder/reproduceMultipleThreads.tar.gz
* cd ../../..
* ./
* open 3 terminals and cd into $CocoonHome/build/webapp/samples/reproduceMultipleThreads in

* dry run without invalidating the cache to see that everything is working:
  - terminal 1: ./ 
  - terminal 2: ./

* run with invalidating the cache every seconds:
  - terminal 1: ./
  - terminal 2: ./
  - terminal 3: ./
* When Apache Bench has run into a timeout you can view the pipelocks with http://localhost:8888/samples/reproduceMultipleThreads/pipelocks

We are currently facing this issue on our production servers.

* removing the pipeline locking code as Ellis suggested?
* making waitForLock fuzzy
* ...

If the pipelock design is the same in Cocoon 2.2 the same deadlock could occur there.

> AbstractCachingProcessingPipeline locking with IncludeTransformer may hang pipeline
> -----------------------------------------------------------------------------------
>                 Key: COCOON-1985
>                 URL:
>             Project: Cocoon
>          Issue Type: Bug
>          Components: * Cocoon Core
>    Affects Versions: 2.1.9, 2.1.10, 2.1.11, 2.2-dev (Current SVN)
>            Reporter: Ellis Pritchard
>            Priority: Critical
>             Fix For: 2.2-dev (Current SVN)
>         Attachments: caching-trials.patch, includer.xsl, patch.txt, sitemap.xmap
> Cocoon 2.1.9 introduced the concept of a lock in AbstractCachingProcessingPipeline, an
optimization to prevent two concurrent requests from generating the same cached content. The
first request adds the pipeline key to the transient cache to 'lock' the cache entry for that
pipeline, subsequent concurrent requests wait for the first request to cache the content (by
Object.lock()ing the pipeline key entry) before proceeding, and can then use the newly cached
> However, this has introduced an incompatibility with the IncludeTransformer: if the inclusions
access the same yet-to-be-cached content as the root pipeline, the whole assembly hangs, since
a lock will be made on a lock already held by the same thread, and which cannot be satisfied.
> e.g.
> i) Root pipeline generates using sub-pipeline cocoon:/foo.xml
> ii) the cocoon:/foo.xml sub-pipeline adds it's pipeline key to the transient store as
a lock.
> iii) subsequently in the root pipeline, the IncludeTransformer is run.
> iv) one of the inclusions also generates with cocoon:/foo.xml, this sub-pipeline locks
in AbstractProcessingPipeline.waitForLock() because the sub-pipeline key is already present.
> v) deadlock.
> I've found a (partial, see below) solution for this: instead of a plain Object being
added to the transient store as the lock object, the Thread.currentThread() is added; when
waitForLock() is called, if the lock object exists, it checks that it is not the same thread
before attempting to lock it; if it is the same thread, then waitForLock() returns success,
which allows generation to proceed. You loose the efficiency of generating the cache only
once in this case, but at least it doesn't hang! With JDK1.5 this can be made neater by using
Thread#holdsLock() instead of adding the thread object itself to the transient store.
> See patch file.
> However, even with this fix, parallel includes (when enabled) may still hang, because
they pass the not-the-same-thread test, but fail because the root pipeline, which holds the
initial lock, cannot complete (and therefore statisfy the lock condition for the parallel
threads), before the threads themselves have completed, which then results in a deadlock again.
> The complete solution is probably to avoid locking if the lock is held by the same top-level
Request, but that requires more knowledge of Cocoon's processing than I (currently) have!
> IMHO unless a complete solution is found to this, then this optimization should be removed
completely, or else made optional by configuration, since it renders the IncludeTransformer

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

View raw message