lucene-dev mailing list archives

From "Erik Hatcher" <li...@ehatchersolutions.com>
Subject Re: Proposal for Lucene
Date Fri, 08 Feb 2002 13:55:55 GMT
I'm developing it for a book I'm writing on Ant, and I've posted one piece
of it here already - my HtmlDocument class that uses JTidy to DOM'ify the
HTML and rip out the title and body contents as two separate fields (without
HTML tags, of course).

I have every intention of giving all the code developed to Lucene or other
Jakarta projects where appropriate.  I haven't yet only because it's still
under development - it's not top secret or anything.  :)  The Ant task
definitely deserves some additional Lucene expertise to make sure it's doing
the right thing, but I have it checking dependencies by embedding a
non-indexed "last modified" field into the Lucene index, which it checks
before actually indexing a document again - so a second incremental run of
indexing is *much* faster since it skips files unless they are newer.
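
For illustration only, the "skip unless newer" check described above could be
sketched roughly like this. This is not the actual Ant task code: the class
name IncrementalIndexSketch and its methods are hypothetical, and a plain
in-memory Map stands in for the stored, non-indexed "last modified" field that
the real task keeps in the Lucene index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an incremental "index only if newer" pass. */
class IncrementalIndexSketch {

    // Assumption: this map stands in for the stored "last modified" value
    // that the real task would read back from the Lucene index per document.
    private final Map<String, Long> lastModifiedByPath = new HashMap<>();

    /** True when the file was never indexed or has changed since it was. */
    boolean needsIndexing(String path, long lastModified) {
        Long recorded = lastModifiedByPath.get(path);
        return recorded == null || lastModified > recorded;
    }

    /**
     * Walks the given files (path -> last-modified timestamp), indexes only
     * the stale ones, and returns the paths that were (re)indexed.
     */
    List<String> run(Map<String, Long> files) {
        List<String> indexed = new ArrayList<>();
        for (Map.Entry<String, Long> file : files.entrySet()) {
            if (needsIndexing(file.getKey(), file.getValue())) {
                // A real task would parse the file and add it to the index
                // here; the sketch only records the new timestamp.
                lastModifiedByPath.put(file.getKey(), file.getValue());
                indexed.add(file.getKey());
            }
        }
        return indexed;
    }
}
```

With this sketch, a second call to run() over unchanged files returns an empty
list, which is the effect that makes the second incremental run so much
faster.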

    Erik

----- Original Message -----
From: "Andrew C. Oliver" <acoliver@apache.org>
To: "Lucene Developers List" <lucene-dev@jakarta.apache.org>
Sent: Friday, February 08, 2002 8:19 AM
Subject: Re: Proposal for Lucene


> Is this open source?  APL'd?  Where can I look at it?
>
> On Thu, 2002-02-07 at 22:00, Erik Hatcher wrote:
> > I've developed something similar myself.  I've created an Ant task <index>
> > that uses DocumentHandler interface implementing classes - one that can be
> > used (<index class="...">) is a FileExtensionDocumentHandler. At build-time
> > I generate a Lucene index of static documents, and roll that into a web
> > application.




