cocoon-users mailing list archives

From Geert Josten <Geert.Jos...@daidalos.nl>
Subject Re: Cinclude transforms very *slow* [ was Re: Speeding up an xinclude transform?]
Date Mon, 09 May 2005 14:01:36 GMT
About the other hint I made:

the cinclude transformer supports the cached-include element. Try doing something like:

<cinclude:cached-include src="http://server/document1.xml"/>

I'm not sure whether this element is actually in a different namespace or not; the
following page, which describes it, is inconclusive:

http://cocoon.apache.org/2.1/userdocs/transformers/cinclude-transformer.html
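
A minimal, untested sketch of how this could look. It assumes that cached-include
lives in the same namespace as the regular cinclude elements, and that the expiry is
set through an "expires" sitemap parameter (in seconds), as I understand the Caching
section of that page. The stylesheet name ind-cincludes.xsl and the value 600 are
made up for illustration; the cocoon:/*.meta.xml pipeline is the one Chris suggested
further down.

```xml
<!-- Output of the stylesheet that writes the include elements (one per file) -->
<file name="101.xml" xmlns:cinclude="http://apache.org/cocoon/include/1.0">
  <cinclude:cached-include src="cocoon:/101.meta.xml"/>
</file>

<!-- Sitemap fragment: let the cinclude transformer resolve and cache the includes -->
<map:match pattern="ind-list">
  <map:generate src="inds" type="directory" include="*.xml"/>
  <!-- hypothetical stylesheet name -->
  <map:transform src="stylesheets/ind/ind-cincludes.xsl"/>
  <map:transform type="cinclude">
    <!-- assumed parameter: lifetime of a cached include, in seconds -->
    <map:parameter name="expires" value="600"/>
  </map:transform>
  <map:serialize type="xml"/>
</map:match>
```

Since your content changes unpredictably, any fixed expiry is a trade-off between
speed and freshness.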

HTH!

Cheers

Derek Hohls wrote:

> Geert
> 
> Yes, I have tried splitting things (using cinclude and not xinclude,
> but the concept is the same) - the part that still takes the longest
> is having to run the cinclude/xinclude step ... regardless of in which
> "match" step it occurs...
> 
> Derek
> 
> 
>>>>Geert.Josten@daidalos.nl 2005/05/09 02:52:59 PM >>>
> 
> If you fragment your pipes into different matches that get called
> 'after each other', Cocoon might 
> be able to cache the results in between. I think that is what made my
> sitemap quick on subsequent calls.
> 
> Something like:
> 
> <map:match pattern="get/*.xml">
>    <map:generate type="file" src="{global:data-dir}/{1}.xml" />
>    <map:serialize type="xml" />
> </map:match>
> 
> <map:match pattern="merge/all.xml">
>    <map:generate type="directory" src="{global:data-dir}" />
>    <map:transform src="dir2include.xsl" />
>    <map:transform type="xinclude" />
>    <map:serialize type="xml" />
> </map:match>
> 
> <map:match pattern="process/all.xml">
>    <map:generate type="file" src="cocoon://merge/all.xml" />
>    <map:transform src="process.xsl" />
>    <map:serialize type="xml" />
> </map:match>
> 
> <map:match pattern="report/all.html">
>    <map:generate type="file" src="cocoon://process/all.xml" />
>    <map:transform src="report.xsl" />
>    <map:serialize type="html" />
> </map:match>
> 
> By the way, I might have been using the plain old XSL document() function
> to merge the lot. I know for sure it is cached, but I believe a bit too
> eagerly. Some people on this list referred to some Bugzilla report.
> 
> Oh, I recall somewhere that cinclude supported a cached-include
> element as well. Or am I mistaken?
> 
> Cheers,
> Geert
> Derek Hohls wrote:
> 
> 
>>Geert
>>
>>Thanks for the suggestion.  The XML files are quite simple: the
>>inml:ind is the first tag and the meta is the first nested tag (only
>>one occurrence).  I did try dropping the // from the call, without
>>any noticeable effect - but I did not add in the [n] values.  I can
>>try this too, tho', if it will make a difference...
>>
>>I have also not seen any performance difference between first
>>and subsequent calls (using cinclude or xinclude) - I assume
>>because neither is cached?
>>
>>Thanks
>>Derek
>>
>>
>>
>>>>>Geert.Josten@daidalos.nl 2005/05/09 02:16:02 PM >>>
>>
>>Hi there,
>>
>>I saw someone make a remark on the xpointer/xpath expression, but never
>>saw a response to it. Have you tried changing the // in //inml:ind/meta
>>to a precise path? Or is that impossible? And how often does that element
>>occur? Would it be sufficient to stop after the first occurrence of
>>inml:ind or meta? You might want to use something like:
>>/root-elem[1]/sub-elem1[1]/sub2[3]/sub3[2]/inml:ind[meta][1]/meta[1]
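>>
>>Applied to the xi:include elements from the original question, and
>>assuming inml:ind is the document element with meta as its first child
>>(as described elsewhere in this thread), a sketch of the precise path
>>could be the following. It is untested, and the namespace URI is the one
>>from the original post:
>>
>>```xml
>><file name="101.xml" xmlns:xi="http://www.w3.org/2001/XInclude">
>>  <!-- absolute path with a positional predicate instead of // -->
>>  <xi:include
>>    href="inds/101.xml#xmlns(inml=http://www.myschema.com)xpointer(/inml:ind/meta[1])"/>
>></file>
>>```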
>>
>>I can't tell whether your problem has to do with caching or not. But
>>I've been processing 1500+ files from the filesystem using Cocoon,
>>filtering, selecting, merging and converting them all together.
>>And though the first call is slow, subsequent requests are fast.
>>
>>I know for sure that // is a performance killer, so I think it is worth
>>investigating.
>>
>>Cheers,
>>Geert
>>
>>Derek Hohls wrote:
>>
>>
>>
>>>Chris
>>>
>>>Yes... the CInclude transformer output isn't cached, and I have
>>>found that the approach you suggested is only marginally faster
>>>than using the xinclude approach I had adopted originally.
>>>
>>>I guess that working with information from numerous XML files
>>>on disc is not really a viable operation using Cocoon... which is,
>>>I think, a pity.
>>>
>>>In my case it's impossible to predict when content can be changed...
>>>could be a whole lot of changes in a few minutes, and then 
>>>nothing for a few months.  This makes it very hard to simply set
>>>a time parameter as suggested in the documents.  There needs to
>>>be something a little more straightforward than that to make it
>>>usable/useful.
>>>
>>>Any more thoughts?
>>>
>>>Derek
>>>
>>>
>>>
>>>
>>>>>>cocoon@chrismaloney.com 2005/05/07 04:17:27 AM >>>
>>>
>>>I misspoke in my last email (actually, mis-wrote  :)
>>>The CInclude transformer output isn't, by default, cached, but the
>>>input to it, "cocoon:/101.meta.xml", below, would be.  In your case,
>>>where you have fifty or so inputs of that form, that's what you'd want.
>>>However, you can also get the CInclude transformer to cache its output,
>>>as described here:
>>>
>>>http://cocoon.apache.org/2.1/userdocs/transformers/cinclude-transformer.html#Caching
>>>
>>>I agree when you said "surely it's Cocoon's "job" to check ....", but
>>>I've gotten myself into a bit of a mess, with lots of these cached
>>>pipeline fragments daisy-chained.  I've also implemented "preemptive
>>>caching" on some of them, and now I just haven't spent the time to dig
>>>in and see where the cache isn't getting properly invalidated.
>>>
>>>Cheers!
>>>
>>>Derek Hohls wrote:
>>>
>>>
>>>
>>>
>>>>Chris(?)
>>>>
>>>>Thanks - I will give this a try (cannot be slower than what I have
>>>>now and looks pretty straightforward).  It had not occurred to me
>>>>because it seemed this would require 50+ calls to the same 
>>>>pipeline.. instead of just one pass through one... but I'll check.
>>>>
>>>>Re the caching - surely it's Cocoon's "job" to check the files
>>>>being used (if they are static, as in this case) and send through
>>>>the latest version... but I will run some tests and see what occurs.
>>>>
>>>>Thanks
>>>>Derek
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>>>cocoon@chrismaloney.com 2005/05/06 02:44:53 PM >>>
>>>>>>>     
>>>>>>>
>>>>
>>>>Is the XPath expression the same in every case ("//inml:ind/meta")?
>>>>If so, then it would be easy to switch to using CInclude, which is
>>>>cached:
>>>>
>>>><file name='101.xml'>
>>>> <ci:include src='cocoon:/101.meta.xml'/>
>>>></file>
>>>>
>>>>And then define a new pipeline to produce 101.meta.xml:
>>>>
>>>><map:match pattern='*.meta.xml'>
>>>> <map:generate src='{1}.xml'/>
>>>> <map:transform src='pull-out-ind-meta.xslt'/>
>>>> <map:serialize/>
>>>></map:match>
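>>>>
>>>>The stylesheet itself is not shown here. A minimal sketch of what
>>>>pull-out-ind-meta.xslt could contain, assuming inml:ind is the document
>>>>element, meta is an un-namespaced child of it, and the inml prefix maps
>>>>to the URI used in the original xpointer (untested):
>>>>
>>>>```xml
>>>><xsl:stylesheet version="1.0"
>>>>    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
>>>>    xmlns:inml="http://www.myschema.com">
>>>>  <!-- Emit only the first meta child of the root inml:ind element -->
>>>>  <xsl:template match="/">
>>>>    <xsl:copy-of select="/inml:ind/meta[1]"/>
>>>>  </xsl:template>
>>>></xsl:stylesheet>
>>>>```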
>>>>
>>>>I'm pretty new to Cocoon, actually, and I've been using this technique
>>>>a lot, for example, to generate my nav bar.  I'm not altogether happy
>>>>with it, though, mainly because I can't figure out how to control the
>>>>cache -- i.e. to make sure that it gets invalidated whenever {1}.xml
>>>>changes.  But, it's pretty fast.
>>>>
>>>>
>>>>Derek Hohls wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>I am looking to find a way to speed up a key step in a pipeline:
>>>>>
>>>>>The one in question has the following steps:
>>>>>
>>>>><map:match pattern="ind-list">
>>>>>  <map:parameter name="handler" value="myindhandler"/>
>>>>>  <map:generate src="inds" type="directory" include="*.xml"/>
>>>>>  <map:transform src="stylesheets/ind/ind-xincludes.xsl">
>>>>>    <map:parameter name="ind-dir" value="inds"/>
>>>>>  </map:transform>
>>>>>  <!-- *NOW SLOW* -->
>>>>>  <map:transform type="xinclude"/>
>>>>>  <map:serialize type="xml"/>
>>>>></map:match>
>>>>>
>>>>>The pipeline is fast up to the end of the first transform,
>>>>>resulting in XML which contains a number of tags of 
>>>>>the form:
>>>>>
>>>>><file name="101.xml">
>>>>><xi:include
>>>>>href="inds/101.xml#xmlns(inml=http://www.myschema.com)xpointer(//inml:ind/meta)"/>
>>>>></file>
>>>>>
>>>>>The number of tags varies by directory, but is typically about 50.
>>>>>The files themselves are small - about 50k - and the "meta"
>>>>>tags have only a few bytes of text.
>>>>>
>>>>>However, this last step takes over a minute, on a fast server
>>>>>(2 GB memory, 3 GHz processor)...
>>>>>
>>>>>What can I do to speed this up significantly, i.e. by at least
>>>>>a factor of 10?
>>>>>
>>>>>Thanks
>>>>>Derek
>>>>>
>>>>>
>>>>>---------------------------------------------------------------------
>>>>>To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org 
>>>>>For additional commands, e-mail: users-help@cocoon.apache.org 

-- 
=====================================
NB: as of 22 April, the Daidalos office is
located at a new address:

Daidalos BV
Hoekeindsehof 1 - 4
2665 JZ Bleiswijk
tel: +31 (0)10 850 12 00
fax: +31 (0)10 850 11 99

The address above is also the postal address.
======================
Geert.Josten@Daidalos.nl
IT-consultant at Daidalos BV

http://www.daidalos.nl/

GPG: 1024D/12DEBB50
