lucene-java-user mailing list archives

From Doug Cutting <DCutt...@grandcentral.com>
Subject RE: Lucene has moved to Jakarta
Date Fri, 05 Oct 2001 21:18:56 GMT
> From: William Wong [mailto:keng.wong@verizon.net]
> 
> How about adding filters for different file types such as
> -HTML (there is one in the demo already)
> -XML
> -PDF
> -MsWord/RTF
> -other common file formats

These would be great.  Who will implement them?
I was only listing tasks that I plan to do.

I think the best API for such converters is a method that takes a
java.io.InputStream and returns a java.io.Reader containing plain text,
e.g.:
     public static java.io.Reader getText(java.io.InputStream);
That way they can easily be used by Lucene analyzers.
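
As a rough illustration of that shape, here is a minimal sketch of one such converter. The class name SimpleMarkupConverter, the readAll helper, the UTF-8 assumption, and the crude tag-stripping regex are all hypothetical and only for illustration; a real HTML, XML, PDF, or Word filter would need a proper parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;

/** Hypothetical converter: the name and the stripping logic are illustrative only. */
public class SimpleMarkupConverter {

    /** Takes the raw document bytes and returns a Reader over plain text. */
    public static Reader getText(InputStream in) throws IOException {
        // Assumption: the input is UTF-8 encoded.
        Reader raw = new InputStreamReader(in, StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        while ((n = raw.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        // Crude stand-in for real parsing: replace anything that looks like a tag with a space.
        String text = sb.toString().replaceAll("<[^>]*>", " ");
        return new StringReader(text);
    }

    /** Drains a Reader into a String; used only by the demo below. */
    public static String readAll(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = r.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "<p>Hello <b>Lucene</b></p>".getBytes(StandardCharsets.UTF_8);
        Reader text = getText(new ByteArrayInputStream(bytes));
        // Prints the words with the markup removed (whitespace is not normalized).
        System.out.println(readAll(text));
    }
}
```

Because the method returns a plain java.io.Reader, the caller does not need to know which file format the text came from.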

Should we put converters in org.apache.lucene.document?

Contributions anyone?

Doug
