cocoon-users mailing list archives

From Geert Josten <Geert.Jos...@daidalos.nl>
Subject Re: Cinclude transforms very *slow* [ was Re: Speeding up an xinclude transform?]
Date Mon, 09 May 2005 12:52:59 GMT
If you split your pipelines into separate matches that call one another in sequence, Cocoon
might be able to cache the intermediate results. I think that is what made my sitemap quick
on subsequent calls.

Something like:

<map:match pattern="get/*.xml">
   <map:generate type="file" src="{global:data-dir}/{1}.xml" />
   <map:serialize type="xml" />
</map:match>

<map:match pattern="merge/all.xml">
   <map:generate type="directory" src="{global:data-dir}" />
   <map:transform src="dir2include.xsl" />
   <map:transform type="xinclude" />
   <map:serialize type="xml" />
</map:match>

<map:match pattern="process/all.xml">
   <map:generate type="file" src="cocoon://merge/all.xml" />
   <map:transform src="process.xsl" />
   <map:serialize type="xml" />
</map:match>

<map:match pattern="report/all.html">
   <map:generate type="file" src="cocoon://process/all.xml" />
   <map:transform src="report.xsl" />
   <map:serialize type="html" />
</map:match>
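
For what it's worth, here is a rough, untested sketch of what dir2include.xsl could look like.
It assumes the usual output of the directory generator (dir:directory with dir:file children)
and points every include at the get/*.xml match above through the cocoon:// protocol. The
files and file wrapper elements and the .xml filter are my own choices for illustration, not
something Cocoon requires:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dir="http://apache.org/cocoon/directory/2.0"
    xmlns:xi="http://www.w3.org/2001/XInclude"
    exclude-result-prefixes="dir">

  <!-- Wrap the whole listing in one root element (the name is hypothetical) -->
  <xsl:template match="/dir:directory">
    <files>
      <!-- Only take entries whose name ends in ".xml" -->
      <xsl:apply-templates
          select="dir:file[substring(@name, string-length(@name) - 3) = '.xml']"/>
    </files>
  </xsl:template>

  <!-- One include per file, resolved by the get/*.xml match in the sitemap -->
  <xsl:template match="dir:file">
    <file name="{@name}">
      <xi:include href="cocoon://get/{@name}"/>
    </file>
  </xsl:template>

</xsl:stylesheet>
```

Because each include then goes through its own small pipeline, Cocoon at least has a chance
to cache those pieces separately.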

By the way, I might have been using the plain old XSLT document() function to merge the lot.
I know for sure its results are cached, but I believe a bit too eagerly. Some people on this
list referred to some Bugzilla report.
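
In case it helps, merging with document() could look roughly like the stylesheet below. It is
an untested sketch with a few assumptions: the input is the directory generator output,
inml:ind is the root element of every file (as Derek described), and the ind-dir parameter
(name borrowed from Derek's sitemap) holds a path that resolves relative to the stylesheet
location, which may not hold in your setup:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dir="http://apache.org/cocoon/directory/2.0"
    xmlns:inml="http://www.myschema.com"
    exclude-result-prefixes="dir inml">

  <!-- Directory holding the data files; the default value is an assumption -->
  <xsl:param name="ind-dir" select="'inds'"/>

  <xsl:template match="/dir:directory">
    <files>
      <xsl:for-each select="dir:file">
        <file name="{@name}">
          <!-- Precise path instead of //inml:ind/meta -->
          <xsl:copy-of
              select="document(concat($ind-dir, '/', @name))/inml:ind/meta"/>
        </file>
      </xsl:for-each>
    </files>
  </xsl:template>

</xsl:stylesheet>
```

This replaces both the include stylesheet and the xinclude step with a single transform.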

Oh, I seem to recall that cinclude supports a cached-include element as well. Or am I
mistaken?
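
If I remember the documentation correctly, it would look something like the fragment below.
Treat it as a sketch from memory rather than checked configuration: the element name
cached-include, the expires parameter and the value 600 (in seconds, I believe) are all
assumptions to verify against the CInclude transformer documentation.

```xml
<!-- In the document that passes through the cinclude transformer -->
<file name="101.xml" xmlns:ci="http://apache.org/cocoon/include/1.0">
  <ci:cached-include src="cocoon:/101.meta.xml"/>
</file>

<!-- In the sitemap: how long an included result may be reused (assumed seconds) -->
<map:transform type="cinclude">
  <map:parameter name="expires" value="600"/>
</map:transform>
```

Being time based, this would not by itself help when content changes at unpredictable
moments.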

Cheers,
Geert
Derek Hohls wrote:

> Geert
> 
> Thanks for the suggestion.  The XML files are quite simple: the
> inml:ind is the first tag and the meta is the first nested tag (only
> one occurrence).  I did try dropping the // from the call, without
> any noticeable effect - but I did not add in the [n] values.  I can
> try this too, though, if it will make a difference...
> 
> I have also not seen any performance difference between first
> and subsequent calls (using cinclude or xinclude) - I assume
> because neither is cached?
> 
> Thanks
> Derek
> 
> 
>>>>Geert.Josten@daidalos.nl 2005/05/09 02:16:02 PM >>>
> 
> Hi there,
> 
> I saw someone make a remark on the xpointer/xpath expression, but never
> saw a response to it. Have you tried changing the // in //inml:ind/meta
> to a precise path? Or is that impossible? And how often does that element
> occur? Would it be sufficient to stop after the first occurrence of
> inml:ind or meta? You might want to use something like:
> /root-elem[1]/sub-elem1[1]/sub2[3]/sub3[2]/inml:ind[meta][1]/meta[1]
> 
> I can't tell whether your problem has to do with caching or not. But
> I've been processing 1500+ files from the filesystem using Cocoon,
> filtering, selecting, merging and converting them all together. And
> though the first call is slow, subsequent requests are fast.
> 
> I know for sure that // is a performance killer, so I think it is worth
> investigating.
> 
> Cheers,
> Geert
> 
> Derek Hohls wrote:
> 
> 
>>Chris
>>
>>Yes... the CInclude transformer output isn't cached, and I have
>>found that the approach you suggested is only marginally faster
>>than using the xinclude approach I had adopted originally.
>>
>>I guess that working with information from numerous XML files
>>on disc is not really a viable operation using Cocoon... which is,
>>I think, a pity.
>>
>>In my case it's impossible to predict when content can be changed...
>>could be a whole lot of changes in a few minutes, and then 
>>nothing for a few months.  This makes it very hard to simply set
>>a time parameter as suggested in the documents.  There needs to
>>be something a little more straightforward than that to make it
>>usable/useful.
>>
>>Any more thoughts?
>>
>>Derek
>>
>>
>>
>>>>>cocoon@chrismaloney.com 2005/05/07 04:17:27 AM >>>
>>
>>I misspoke in my last email (actually, mis-wrote  :)
>>The CInclude transformer output isn't, by default, cached, but the
>>input to it, "cocoon:/101.meta.xml", below, would be.  In your case,
>>where you have fifty or so inputs of that form, that's what you'd want.
>>However, you can also get the CInclude transformer to cache its output,
>>as described here:
>>
>>http://cocoon.apache.org/2.1/userdocs/transformers/cinclude-transformer.html#Caching
>>
>>I agree when you said "surely it's Cocoon's "job" to check ....", but
>>I've gotten myself into a bit of a mess, with lots of these cached
>>pipeline fragments daisy-chained.  I've also implemented "preemptive
>>caching" on some of them, and now I just haven't spent the time to dig
>>in and see where the cache isn't getting properly invalidated.
>>
>>Cheers!
>>
>>Derek Hohls wrote:
>>
>>
>>
>>>Chris(?)
>>>
>>>Thanks - I will give this a try (cannot be slower than what I have
>>>now and looks pretty straightforward).  It had not occurred to me
>>>because it seemed this would require 50+ calls to the same 
>>>pipeline.. instead of just one pass through one... but I'll check.
>>>
>>>Re the caching - surely it's Cocoon's "job" to check the files
>>>being used (if they are static, as in this case) and send through
>>>the latest version... but I will run some tests and see what occurs.
>>>
>>>Thanks
>>>Derek
>>>
>>>
>>>
>>>
>>>
>>>>>>cocoon@chrismaloney.com 2005/05/06 02:44:53 PM >>>
>>>>>>      
>>>>>>
>>>
>>>Is the XPath expression the same in every case ("//inml:ind/meta")?
>>>If so, then it would be easy to switch to using CInclude, which is
>>>cached:
>>>
>>><file name='101.xml'>
>>>  <ci:include src='cocoon:/101.meta.xml'/>
>>></file>
>>>
>>>And then define a new pipeline to produce 101.meta.xml:
>>>
>>><map:match pattern='*.meta.xml'>
>>>  <map:generate src='{1}.xml'/>
>>>  <map:transform src='pull-out-ind-meta.xslt'/>
>>>  <map:serialize type='xml'/>
>>></map:match>
>>>
>>>I'm pretty new to Cocoon, actually, and I've been using this technique
>>>a lot, for example, to generate my nav bar.  I'm not altogether happy
>>>with it, though, mainly because I can't figure out how to control the
>>>cache -- i.e. to make sure that it gets invalidated whenever {1}.xml
>>>changes.  But, it's pretty fast.
>>>
>>>
>>>Derek Hohls wrote:
>>>
>>>
>>>
>>>
>>>
>>>>I am looking to find a way to speed up a key step in a pipeline:
>>>>
>>>>The one in question has the following steps:
>>>>
>>>><map:match pattern="ind-list">
>>>><map:parameter name="handler" value="myindhandler"/>
>>>><map:generate src="inds" type="directory" include="*.xml"/>  
>>>><map:transform src="stylesheets/ind/ind-xincludes.xsl" >
>>>> <map:parameter name="ind-dir" value="inds"/>
>>>></map:transform>
>>>><!-- *NOW SLOW* -->
>>>><map:transform type="xinclude"/>
>>>><map:serialize type="xml"/> 
>>>></map:match> 
>>>>
>>>>The pipeline is fast up to the end of the first transform,
>>>>resulting in XML which contains a number of tags of 
>>>>the form:
>>>>
>>>><file name="101.xml">
>>>><xi:include
>>>>href="inds/101.xml#xmlns(inml=http://www.myschema.com)xpointer(//inml:ind/meta)"/>
>>>></file>
>>>>
>>>>The number of tags varies by directory, but is typically about 50.
>>>>The files themselves are small - about 50k - and the "meta"
>>>>tags have only a few bytes of text.
>>>>
>>>>However, this last step takes over a minute!  on a fast server
>>>>(2 GB memory, 3 GHz processor)...
>>>>
>>>>What can I do to ensure that this is sped up significantly...?
>>>>i.e. by at least a factor of 10!
>>>>
>>>>Thanks
>>>>Derek
>>>>
>>>>
>>>>---------------------------------------------------------------------
>>>>To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org 
>>>>For additional commands, e-mail: users-help@cocoon.apache.org 

-- 
=====================================
NB: the Daidalos office has been located at a new
address since 22 April:

Daidalos BV
Hoekeindsehof 1 - 4
2665 JZ Bleiswijk
tel: +31 (0)10 850 12 00
fax: +31 (0)10 850 11 99

The above address is also the postal address.
======================
Geert.Josten@Daidalos.nl
IT-consultant at Daidalos BV

http://www.daidalos.nl/

GPG: 1024D/12DEBB50
