forrest-dev mailing list archives

From Steven Noels <>
Subject Re: [RT] Lucene integration
Date Fri, 07 Mar 2003 11:28:58 GMT
Jeff Turner wrote:
> On Fri, Mar 07, 2003 at 11:29:36AM +0100, Steven Noels wrote:
>>I'd like to give Lucene a whirl, making it a standard part of Forrest.
> Sounds good.  The Wiki search is very useful.

Just to be sure: that Wiki search comes with JSPWiki OOTB, and has 
nothing to do with Lucene, I reckon.

The Wiki isn't using any Forrest feature except for the visual 
look&feel, reimplemented in horrible JSP :-)

>>Some issues I'd like to discuss before that:
>> - making it an optional part
>>Lucene makes no sense in the CLI mode of Forrest, and I'm wondering how 
>>I could make this integration switchable:
>>    - make the post URI of that search form box parametrisable, so that 
>>people don't have to edit the skinconf to switch between CLI and webapp 
>>    - prevent the search pipelines to be accessible in CLI mode 
>>(although I shouldn't bother too much about that, I guess - the Views 
>>should make that transparent)
> Hmm. Is Lucene able to generate indexes, or is it purely a search engine?

Both, see and

> I think the Cocoon CLI sets a User-Agent header, so we could have a
> selector which uses it to send different output if the CLI is requesting
> the page.

Might be, should check how it has been done for Cocoon docs.
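
As a rough illustration of the selector idea, here is a minimal sketch of choosing output based on the User-Agent header. It is plain Python rather than a Cocoon sitemap, and everything named in it is an assumption for illustration: the marker string "cocoon-cli", the function names, and the search URI are hypothetical, since the actual header value the CLI sends is not stated in this thread.

```python
# Hypothetical marker; the real User-Agent value sent by the Cocoon CLI
# is not given in this thread and would need to be checked.
CLI_AGENT_MARKER = "cocoon-cli"


def select_view(headers, marker=CLI_AGENT_MARKER):
    """Return 'cli' if the User-Agent header contains the marker, else 'webapp'."""
    agent = headers.get("User-Agent", "")
    return "cli" if marker.lower() in agent.lower() else "webapp"


def search_form_action(headers, webapp_uri="/search.html"):
    """Return the post URI for the search box, or None when the CLI is requesting.

    webapp_uri is a placeholder, not an actual Forrest path.
    """
    return webapp_uri if select_view(headers) == "webapp" else None
```

With something along these lines, a skin could leave out the search box whenever the action is None, so nobody has to edit the skinconf by hand to switch modes.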

>> - for cleanliness purposes, I'm thinking to put this in a subsitemap: 
>>I'd like your thoughts on this, too.
> A subsitemap would be best if possible.  Over the last few days I've been
> rewriting the sitemap to be modular and strictly layered:
> LAYER 1       |   (each format or subdir handler in its own sub-sitemap)
> *.xml         |
>    various    |    docv11     faq    howto    docbook   community/*  ....
>    xml types  |       \        |       |         |         /
> -------------------------------------------------------------------------
>                          DOCUMENT-V11 INTERMEDIATE FORMAT
> -------------------------------------------------------------------------
> LAYER 2       |                /       |               \
>  Intermediate |    **body-*.xml     **menu-*.xml      **tab-*.xml  
>  HTML formats |               \        |               /
> -------------------------------------------------------------------------
> LAYER 3       |                     \|/       \|/
>   Output      |                   *.html     *.pdf
>   formats     |
> -------------------------------------------------------------------------
> The goal is to be able to add a new source format simply by dropping a
> new <format>.xmap file.  For instance, to support 'aggregate' pages
> (merging multiple XML sources), drop in a sitemap that defines
> cocoon:/merged-files.xml, and link to merged-files.html.

Looks like our slow discussion on dynamic sitemaps/pipelines has 
thoroughly infected your neurons - looking forward to it!
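
The layering in the diagram can be sketched in a few lines of Python, purely as an illustration of the idea and not of how the sitemap is actually implemented. All names here (SOURCE_HANDLERS, source_format, render_html, the dictionary shapes) are hypothetical. Each source format registers one converter to a shared intermediate form (layer 1), the body and menu fragments are derived from that form only (layer 2), and the output step combines the fragments (layer 3).

```python
# Layer 1: one converter per source format, all producing the same
# intermediate structure: {"title": str, "sections": [str, ...]}.
SOURCE_HANDLERS = {}


def source_format(name):
    """Decorator that registers a converter for the given source format."""
    def register(fn):
        SOURCE_HANDLERS[name] = fn
        return fn
    return register


@source_format("faq")
def faq_to_intermediate(src):
    return {"title": src["title"], "sections": list(src["questions"])}


@source_format("howto")
def howto_to_intermediate(src):
    return {"title": src["title"], "sections": list(src["steps"])}


# Layer 2: fragments built from the intermediate form only.
def body_fragment(doc):
    headings = "".join("<h2>%s</h2>" % s for s in doc["sections"])
    return "<h1>%s</h1>%s" % (doc["title"], headings)


def menu_fragment(doc):
    return "<ul>%s</ul>" % "".join("<li>%s</li>" % s for s in doc["sections"])


# Layer 3: final output assembled from the fragments.
def render_html(fmt, src):
    doc = SOURCE_HANDLERS[fmt](src)
    return "<html>%s%s</html>" % (menu_fragment(doc), body_fragment(doc))
```

Adding a new source format then means registering one more converter; layers 2 and 3 stay untouched, which is the property the "drop in a new <format>.xmap file" goal is after.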

> The next step is to divide the 'support' files up into modules.  Eg, only
> the dtdx.xmap file needs nekopull.jar and dtdx2flat.xsl, so that can be a
> downloadable unit.  Lucene (1.6mb unfortunately) could be another module.

I wouldn't worry too much about size. Size matters. :-P

Seriously: the thing about size which worries me most is the CLI use of 
Forrest for several projects by one user. When seeding and building a 
new project, Forrest copies across some 10 Meg of files to create the 
context. Getting rid of that, having the context reside in 
%FORREST_HOME% would be a Good Thing.

> This new sitemap mostly works, but a Cocoon bug is breaking the site:
> link resolution.  I'm currently trying to upgrade Cocoon, which is being
> a PITA.  If Lucene also needs a Cocoon upgrade you might want to wait
> till I'm done.

No sweat - looking forward to your refactoring before I get rolling!

Steven Noels                  
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at  
stevenn at                stevenn at
