lucene-java-user mailing list archives

From jay <>
Subject RE: many analyzers, same index.
Date Mon, 22 Oct 2001 11:54:17 GMT

> The better approach is to implement converters that convert these
> formats to plain text, either a String or a Reader.  Then you can
> use the same analyzer for documents in different formats.

Has anyone tried using third-party open source utilities to do
this?  xpdf converts PDF to text, and catdoc converts MS Word to
text.  Maybe these can be used to create the plain text for the
index...
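
To make the idea concrete, here is a rough sketch of what such a converter might look like, assuming the xpdf command-line tool `pdftotext` and `catdoc` are installed and on the PATH. The class name `ExternalTextExtractor` and its two methods are made up for illustration and are not part of Lucene. It also assumes both tools write UTF-8 to standard output, which may not match your setup.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: turns a PDF or Word file into a Reader of plain text
// by shelling out to an external converter.
final class ExternalTextExtractor {

    // Picks the external command for a file based on its extension.
    // "pdftotext <file> -" asks pdftotext to write the text to stdout;
    // catdoc writes to stdout by default.
    static List<String> commandFor(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return Arrays.asList("pdftotext", path, "-");
        }
        if (lower.endsWith(".doc")) {
            return Arrays.asList("catdoc", path);
        }
        throw new IllegalArgumentException("No converter for " + path);
    }

    // Runs the converter and returns its output as a Reader.
    // Assumption: the tool emits UTF-8; adjust the charset if yours differs.
    static Reader toReader(String path) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(commandFor(path));
        // Let the tool's error messages go to our stderr so they are not lost
        // and cannot fill up an unread pipe.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process process = builder.start();

        StringBuilder text = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192];
            int count;
            while ((count = in.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
        }

        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Converter exited with code " + exitCode + " for " + path);
        }
        return new StringReader(text.toString());
    }
}
```

The Reader returned by `toReader` could then be handed to whatever field-construction call your Lucene version provides for Reader values, so PDF and Word documents go through the same analyzer as plain text.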

I look forward to seeing PDF and Word indexing added
to this solution.

My Best;


