lucene-java-user mailing list archives

From Erik Hatcher <e...@ehatchersolutions.com>
Subject Re: Content from multiple folders in single index
Date Fri, 27 Aug 2004 14:51:10 GMT
You should consider using the Ant <index> task in the Sandbox
(contributions/ant directory).  You'll need to write a custom document
handler implementation to handle PDFs and any other types you like.
The built-in handler covers text and HTML files, and the task lets you
plug in your own handler.

The <index> task uses Ant's filesets to determine what should be
indexed, so you could simply root a fileset at web and set
excludes="include/" to skip that directory.
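
As a rough sketch of what that could look like in a build file: the
layout below is an assumption, not something taken from your setup.
The lib/ directory, the build/index output path, and the handler class
com.example.PdfAwareDocumentHandler are hypothetical names, and the
attribute names on <index> (index, overwrite, documenthandler) and the
task class name are as I recall them from the sandbox task, so check
them against the version you build. The fileset part is plain Ant.

```xml
<project name="site-index" default="index" basedir=".">

  <!-- Assumption: the Lucene jar, the sandbox Ant task jar, and the jar
       holding your custom handler all live in lib/. -->
  <path id="index.classpath">
    <fileset dir="lib" includes="*.jar"/>
  </path>

  <!-- Task class name as recalled from contributions/ant; verify it. -->
  <taskdef name="index"
           classname="org.apache.lucene.ant.IndexTask"
           classpathref="index.classpath"/>

  <target name="index">
    <!-- com.example.PdfAwareDocumentHandler is a hypothetical class you
         would write to handle PDFs alongside text and HTML. -->
    <index index="build/index"
           overwrite="true"
           documenthandler="com.example.PdfAwareDocumentHandler">
      <!-- Start at web, but leave out everything under web/include.
           A pattern ending in "/" is treated by Ant as "include/**". -->
      <fileset dir="web" excludes="include/"/>
    </index>
  </target>

</project>
```

With this, pages and downloads both end up in the one index, and
nothing under include is touched. If you would rather list what to
index than what to skip, two filesets (dir="web/pages" and
dir="web/downloads") inside the same <index> element should do the
same job.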

	Erik

On Aug 25, 2004, at 7:00 PM, John Greenhill wrote:

> Hi,
>
> I suspect this is an easy one but I didn't see a reference in the FAQ's
> so I thought I'd ask. I have a file structure like this:
>
> web
>   - pages
>   - downloads (pdf docs)
>   - include
>
> I want to index the html in pages and the pdf's in downloads, but not
> the html in include, so I don't want to start my index at web. I've
> modified the IndexHTML in demo to do the pdf's.
>
> What is the best way to do this? Thanks for your suggestions.
>
> John
>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org

