lucene-java-user mailing list archives

From Erik Hatcher
Subject Re: Content from multiple folders in single index
Date Fri, 27 Aug 2004 14:51:10 GMT
You should consider using the Ant <index> task in the Sandbox
(contributions/ant directory).  The built-in document handler covers
text and HTML files, but it is pluggable, so you'll need to write a
custom document handler implementation for PDFs and any other types
you like.

The <index> task uses Ant's filesets to determine what should be
indexed, so you could point a fileset at web and simply add
excludes="include/" to leave that directory out.
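
A minimal sketch of such a build file follows. It assumes the sandbox
task class is org.apache.lucene.ant.IndexTask and that the task accepts
index and documenthandler attributes; check the task's own documentation
for your version. The lib directory, the build/index location, and the
handler class name are hypothetical placeholders.

```xml
<project name="site-index" default="index">
  <!-- Assumption: the sandbox task class is org.apache.lucene.ant.IndexTask.
       lib/ is a hypothetical directory holding the Lucene and sandbox jars,
       plus the jar containing your custom document handler. -->
  <taskdef name="index" classname="org.apache.lucene.ant.IndexTask">
    <classpath>
      <fileset dir="lib" includes="*.jar"/>
    </classpath>
  </taskdef>

  <target name="index">
    <!-- com.example.PdfAwareDocumentHandler is a hypothetical custom handler
         that would deal with PDFs in addition to text and HTML. -->
    <index index="build/index"
           documenthandler="com.example.PdfAwareDocumentHandler">
      <!-- Start at web, but skip everything under web/include.
           A pattern ending in "/" is shorthand for "include/**". -->
      <fileset dir="web" excludes="include/"/>
    </index>
  </target>
</project>
```

With a single fileset rooted at web, pages and downloads are both picked
up and only include is skipped, so there is no need to build the index
from two separate starting points.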


On Aug 25, 2004, at 7:00 PM, John Greenhill wrote:

> Hi,
> I suspect this is an easy one but I didn't see a reference in the FAQs
> so I thought I'd ask. I have a file structure like this:
> web
>   - pages
>   - downloads (pdf docs)
>   - include
> I want to index the HTML in pages and the PDFs in downloads, but not
> the HTML in include, so I don't want to start my index at web. I've
> modified the IndexHTML in demo to do the PDFs.
> What is the best way to do this? Thanks for your suggestions.
> John
