Date: Fri, 7 Mar 2003 22:19:02 +1100
From: Jeff Turner
To: forrest-dev@xml.apache.org
Subject: Re: [RT] Lucene integration
Message-ID: <20030307111902.GB3905@expresso.localdomain>
In-Reply-To: <3E687490.4060004@outerthought.org>

On Fri, Mar 07, 2003 at 11:29:36AM +0100, Steven Noels wrote:
> Folks,
>
> I'd like to give Lucene a whirl, making it a standard part of Forrest.

Sounds good. The Wiki search is very useful.

> Some issues I'd like to discuss before that:
>
> - making it an optional part
>
> Lucene makes no sense in the CLI mode of Forrest, and I'm wondering how
> I could make this integration switchable:
>  - make the post URI of that search form box parametrisable, so that
> people don't have to edit the skinconf to switch between CLI and webapp
> targets
>  - prevent the search pipelines from being accessible in CLI mode
> (although I shouldn't bother too much about that, I guess - the Views
> should make that transparent)

Hmm. Is Lucene able to generate indexes, or is it purely a search engine?

I think the Cocoon CLI sets a User-Agent header, so we could have a
selector which uses it to send different output if the CLI is requesting
the page.

> - for cleanliness purposes, I'm thinking of putting this in a subsitemap:
> I'd like your thoughts on this, too.

A subsitemap would be best if possible. Over the last few days I've been
rewriting the sitemap to be modular and strictly layered:

  LAYER 1       | (each format or subdir handler in its own sub-sitemap)
  *.xml         |
  various       | docv11   faq   howto   docbook   community/*   ....
  xml types     |    \      |      |        |         /
  -------------------------------------------------------------------------
                  DOCUMENT-V11 INTERMEDIATE FORMAT
  -------------------------------------------------------------------------
  LAYER 2       |         /              |              \
  Intermediate  | **body-*.xml     **menu-*.xml     **tab-*.xml
  HTML formats  |         \              |              /
  -------------------------------------------------------------------------
  LAYER 3       |        \|/            \|/
  Output        |      *.html          *.pdf
  formats       |
  -------------------------------------------------------------------------

The goal is to be able to add a new source format simply by dropping in a
new .xmap file. For instance, to support 'aggregate' pages (merging
multiple XML sources), drop in a sitemap that defines
cocoon:/merged-files.xml, and link to merged-files.html.

The next step is to divide the 'support' files up into modules.
E.g., only the dtdx.xmap file needs nekopull.jar and dtdx2flat.xsl, so that
can be a downloadable unit. Lucene (1.6mb unfortunately) could be another
module.

This new sitemap mostly works, but a Cocoon bug is breaking the site:
link resolution. I'm currently trying to upgrade Cocoon, which is being a
PITA. If Lucene also needs a Cocoon upgrade you might want to wait till
I'm done.

--Jeff

>
> --
> Steven Noels                            http://outerthought.org/
> Outerthought - Open Source, Java & XML Competence Support Center
> Read my weblog at            http://blogs.cocoondev.org/stevenn/
> stevenn at outerthought.org                stevenn at apache.org
>
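
P.S. To make the layering idea concrete, here is a rough sketch in plain
Python rather than real sitemap syntax. It is only an illustration under
stated assumptions: every name in it (register_source, to_intermediate_html,
render, and so on) is made up for the example and none of it is Cocoon or
Forrest code. Each layer-1 handler converts one source format to the common
intermediate document, and layers 2 and 3 never need to know which handler
ran. For simplicity the sketch derives the body, menu and tab parts from the
single document, which the real pipelines would not necessarily do.

```python
# Illustrative only: plain Python standing in for layered sitemaps.
# All names here are hypothetical and not part of Cocoon or Forrest.

SOURCE_HANDLERS = {}  # layer 1: one handler per source format


def register_source(name):
    """Register a layer-1 handler (the analogue of dropping in a new .xmap)."""
    def decorator(func):
        SOURCE_HANDLERS[name] = func
        return func
    return decorator


@register_source("faq")
def faq_to_document(source):
    # Layer 1: convert a source format to the common intermediate format.
    return {"title": source["question"], "body": source["answer"]}


@register_source("docv11")
def docv11_to_document(source):
    # Already in the intermediate format, so pass a copy straight through.
    return dict(source)


def to_intermediate_html(document):
    # Layer 2: split the intermediate document into body, menu and tab parts.
    return {
        "body": "<div>%s</div>" % document["body"],
        "menu": "<ul><li>%s</li></ul>" % document["title"],
        "tab": "<span>%s</span>" % document["title"],
    }


def to_html(parts):
    # Layer 3: assemble the final output page from the layer-2 parts.
    return "<html>%s%s%s</html>" % (parts["tab"], parts["menu"], parts["body"])


def render(kind, source):
    # Layers 2 and 3 are shared; only the layer-1 lookup depends on the format.
    document = SOURCE_HANDLERS[kind](source)
    return to_html(to_intermediate_html(document))
```

Supporting a new source format in this sketch means adding one more
@register_source function; render() and the two lower layers stay untouched,
which is the property the layered sitemap is aiming for.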