lucene-dev mailing list archives

From "Erik Hatcher" <>
Subject Re: Proposal for Lucene
Date Fri, 08 Feb 2002 13:55:55 GMT
I'm developing it for a book I'm writing on Ant, and I've posted one piece
of it here already - my HtmlDocument class that uses JTidy to DOM'ify the
HTML and rip out the title and body contents as two separate fields (without
HTML tags, of course).
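
As a rough illustration of that title/body split (not the actual
HtmlDocument class), here is a minimal sketch. It assumes well-formed XHTML
and uses the JDK's built-in XML parser in place of JTidy, which would
normally clean up messy HTML first. The class name HtmlFieldsSketch and the
method extractFields are hypothetical, and a plain Map stands in for the
two Lucene fields.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: the JDK XML parser stands in for JTidy, so the
// input must already be well-formed XHTML.
class HtmlFieldsSketch {

    // Returns a map with two entries, "title" and "body", both tag-free.
    static Map<String, String> extractFields(String xhtml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document dom = builder.parse(new InputSource(new StringReader(xhtml)));

            Map<String, String> fields = new LinkedHashMap<>();
            fields.put("title", textOf(dom, "title"));
            fields.put("body", textOf(dom, "body"));
            return fields;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    // Text content of the first element with the given tag name, with runs
    // of whitespace collapsed; empty string if the element is absent.
    private static String textOf(Document dom, String tag) {
        NodeList nodes = dom.getElementsByTagName(tag);
        if (nodes.getLength() == 0) {
            return "";
        }
        return nodes.item(0).getTextContent().trim().replaceAll("\\s+", " ");
    }
}
```

In a real indexer each map entry would become its own field on the Lucene
document, so that title and body can be searched separately.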

I have every intention of giving all the code developed to Lucene or other
Jakarta projects where appropriate.  I haven't yet only because it's still
under development - it's not top secret or anything.  :)  The Ant task
definitely deserves some additional Lucene expertise to make sure it's doing
the right thing.  It checks dependencies by embedding a non-indexed "last
modified" field into the Lucene index, which it consults before actually
indexing a document again.  A second, incremental indexing run is *much*
faster since it skips files unless they are newer.
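
The dependency check boils down to something like the sketch below. This is
only an illustration of the idea, not the Ant task's code: the class name
IncrementalIndexSketch and the method indexIfNewer are made up, and an
in-memory Map stands in for the "last modified" value that the real task
stores in the Lucene index.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the skip-unless-newer check. The map stands in
// for the stored "last modified" field kept alongside each document.
class IncrementalIndexSketch {

    // path -> last-modified timestamp recorded when the file was indexed
    private final Map<String, Long> lastModifiedByPath = new HashMap<>();

    // Returns true if the file was (re)indexed, false if it was skipped
    // because the recorded timestamp is at least as new as the file's.
    boolean indexIfNewer(String path, long fileLastModified) {
        Long recorded = lastModifiedByPath.get(path);
        if (recorded != null && fileLastModified <= recorded) {
            return false; // unchanged since the last run, skip it
        }
        // A real task would build the document and add it to the index
        // here, replacing any older copy of the same file.
        lastModifiedByPath.put(path, fileLastModified);
        return true;
    }
}
```

On a first run every file is indexed; on a second run with unchanged
timestamps every call returns false, which is where the speedup comes from.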


----- Original Message -----
From: "Andrew C. Oliver" <>
To: "Lucene Developers List" <>
Sent: Friday, February 08, 2002 8:19 AM
Subject: Re: Proposal for Lucene

> Is this open source?  APL'd?  Where can I look at it?
> On Thu, 2002-02-07 at 22:00, Erik Hatcher wrote:
> > I've developed something similar myself.  I've created an Ant task
> > that uses DocumentHandler interface implementing classes - one that can be
> > used (<index class="...">) is a FileExtensionDocumentHandler. At
> > I generate a Lucene index of static documents, and roll that into a web
> > application.
