cocoon-dev mailing list archives

From "Ard Schrijvers" <a.schrijv...@hippo.nl>
Subject New HighlightingTransformer
Date Fri, 30 Sep 2005 11:39:30 GMT
Hello everybody,

I have been working on transformers that highlight words in an XML document. They wrap matching
words in the document with tags and attributes that can be set through parameters.

Currently, I have created two transformers:
1) Highlighting a single keyword specified with a parameter
2) Highlighting multiple keywords and automatically setting links on words according to an
extra file (the src attribute of the transformer, fetched through the SourceResolver) containing
keywords and links, like a dictionary or a thesaurus.
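
To illustrate the second transformer's lookup step, here is a minimal sketch in plain Java. It is
not the transformer itself: the class name KeywordLinker and the method link are hypothetical, the
keyword file is assumed to be already loaded into a Map from keyword to link, and the sketch works
on a plain string rather than on SAX events. Words are separated by a single space only.

```java
import java.util.Map;

public class KeywordLinker {

    /**
     * Wraps every space-separated token that appears in the dictionary
     * in an <a href="..."> element. Hypothetical helper, for illustration only.
     */
    public static String link(String text, Map<String, String> links) {
        StringBuilder out = new StringBuilder();
        // Limit -1 keeps trailing empty tokens, so spacing is preserved exactly.
        String[] tokens = text.split(" ", -1);
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                out.append(' ');
            }
            String href = links.get(tokens[i]);
            if (href == null) {
                // Not a keyword: copy the token through unchanged.
                out.append(tokens[i]);
            } else {
                // Keyword found: wrap it with a link. No escaping is done here;
                // the inputs are assumed to be safe to embed as-is.
                out.append("<a href=\"").append(href).append("\">")
                   .append(tokens[i]).append("</a>");
            }
        }
        return out.toString();
    }
}
```

With a hash-based Map, each word costs one lookup no matter how many keywords the file holds, which
is one plausible way to keep large keyword files affordable.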

The transformers are parameterizable, and every parameter has a default value.

For example, 

<map:transform type="highlightkeywordstransformer" src="keywords10000.xml"/>

would search the XML inside the "body" element and wrap each keyword it finds with <a href="if_link_found">keyword</a>.

But, 
<map:transform type="highlightkeywordstransformer" src="keywords10000.xml">
	<map:parameter name="containerElement" value="div"/>
	<map:parameter name="containerElementId" value="maincontent"/>
	<map:parameter name="wrapElement" value="a"/>
	<map:parameter name="wrapAttributeClass" value="thisClass"/>
	<map:parameter name="wrapAttributeStyle" value="color:green"/>
</map:transform>

would only highlight keywords found inside "div" container elements with id="maincontent".
Each wrapAttributeXXX parameter with value="YYY" becomes an attribute on the wrap element, as in
<wrapElement XXX="YYY"> (thus wrapAttributeClass is translated to the class attribute).
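
The parameter-to-attribute mapping described above could be sketched as follows. This is only an
illustration under assumptions: the class WrapAttributes and its methods are hypothetical, the
sitemap parameters are assumed to be available as a Map, and the attribute name is assumed to be
the part after "wrapAttribute" with its first letter lower-cased.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class WrapAttributes {

    private static final String PREFIX = "wrapAttribute";

    /** Collects wrapAttributeXXX parameters into attribute name/value pairs. */
    public static Map<String, String> fromParameters(Map<String, String> params) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            String name = e.getKey();
            if (name.startsWith(PREFIX) && name.length() > PREFIX.length()) {
                String suffix = name.substring(PREFIX.length());
                // Assumption: "Class" becomes "class" by lower-casing the first letter.
                String attr = suffix.substring(0, 1).toLowerCase(Locale.ROOT)
                        + suffix.substring(1);
                attrs.put(attr, e.getValue());
            }
        }
        return attrs;
    }

    /** Builds the opening tag of the wrap element, e.g. <a class="thisClass">. */
    public static String openTag(String wrapElement, Map<String, String> attrs) {
        StringBuilder tag = new StringBuilder("<").append(wrapElement);
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            tag.append(' ').append(e.getKey())
               .append("=\"").append(escape(e.getValue())).append('"');
        }
        return tag.append('>').toString();
    }

    // Minimal attribute-value escaping for the characters that would break the tag.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("\"", "&quot;").replace("<", "&lt;");
    }
}
```

For the parameters in the sitemap snippet above, openTag("a", ...) would produce
<a class="thisClass" style="color:green">.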

Test results (Windows, 2.8 GHz, 1 GB memory):
1) Single-keyword highlighting in 100 KB of XML: ~35 ms
2) - 10 KB XML, 1,000 keywords with links: ~31 ms
   - 10 KB XML, 10,000 keywords with links: ~94 ms
   - 10 KB XML, 200,000 keywords with links: ~1.5 s

To do:
1) Implement CacheableProcessingComponent
2) Use java.text.BreakIterator (at the moment I separate words by " ", which is of course dirty)
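
For the second to-do item, word splitting with java.text.BreakIterator could look roughly like the
sketch below. The class WordSplitter and the method words are hypothetical names, and how the
transformer would feed character data into it is left open.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordSplitter {

    /** Returns the words in the text, skipping whitespace and punctuation segments. */
    public static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> words = new ArrayList<>();
        int start = it.first();
        // Walk over consecutive boundaries; each [start, end) pair is one segment.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            // Segments between boundaries include spaces and punctuation;
            // keep only those that start with a letter or digit.
            if (Character.isLetterOrDigit(segment.codePointAt(0))) {
                words.add(segment);
            }
        }
        return words;
    }
}
```

Unlike splitting on " ", this keeps "world" separate from a trailing "!" or ",", so a keyword at the
end of a sentence would still be matched.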

I would like to donate these two transformers. Shall I send in a patch?

Ard Schrijvers

Hippo
Oosteinde 11
1017WT
Amsterdam
The Netherlands

Telephone: +31(0)20-5224466
Fax:      +31(0)20-5224467 
-------------------------------------------------------------
a.schrijvers@hippo.nl / www.hippo.nl
--------------------------------------------------------------
