lucene-java-user mailing list archives

From "Jun Zhou" <ACP0...@sheffield.ac.uk>
Subject Re: index other document types
Date Fri, 26 Jul 2002 16:10:07 GMT
Thank you very much, Dave! Now I am confident that I can use Lucene for my project.

Best regards
Jun Zhou
ACP01JZ@sheffield.ac.uk

----- Original Message ----- 
From: "Dave Peixotto" <peixotto@geofolio.com>
To: "Lucene Users List" <lucene-user@jakarta.apache.org>
Sent: Friday, July 26, 2002 4:34 PM
Subject: Re: index other document types


> Lucene is very good at indexing and searching text documents.  If you need
> to index other types of documents (Word docs, PDFs, etc.) then a good
> strategy is to convert those documents to text and use Lucene to index the
> text version of the document.  If you already have a tool to convert other
> document types to text, then you should have no trouble indexing those
> documents.
> 
> ----- Original Message -----
> From: "Jun Zhou" <ACP01JZ@sheffield.ac.uk>
> To: "Lucene Users List" <lucene-user@jakarta.apache.org>
> Sent: Friday, July 26, 2002 7:52 AM
> Subject: index other document types
> 
> 
> > Dear all,
> >
> >  I learned from the Lucene FAQ that to index other document types we need
> > to provide a parser or extractor for each document type. I know some tools
> > are available that convert other document types to txt format. Does such a
> > converter count as a parser or extractor?
> >
> >  Thank you for your kind assistance in advance.
> >
> >  Best regards
> > Jun Zhou
> > acp01jz@sheffield.ac.uk
> >
> 
> 
> 
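The strategy Dave describes (convert each document to plain text first, then index the text) can be sketched roughly as below. This is only an illustration under stated assumptions: the class name TextExtractorRegistry and its methods are hypothetical and are not part of Lucene, and the actual converters (for PDF, Word, and so on) are assumed to be supplied by whatever conversion tool is already available. In this reading, the converter plays the role of the "parser or extractor" that the FAQ mentions.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper (not a Lucene class): chooses a text converter
 * by file extension and returns the plain text to be indexed.
 */
class TextExtractorRegistry {
    // Maps a lowercase file extension to a function that turns raw bytes into plain text.
    private final Map<String, Function<byte[], String>> extractors = new HashMap<>();

    /** Registers a converter for one document type, e.g. "pdf" or "doc". */
    void register(String extension, Function<byte[], String> extractor) {
        extractors.put(extension.toLowerCase(Locale.ROOT), extractor);
    }

    /** Converts the given file content to plain text using the matching converter. */
    String extractText(String fileName, byte[] content) {
        int dot = fileName.lastIndexOf('.');
        String ext = dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
        Function<byte[], String> extractor = extractors.get(ext);
        if (extractor == null) {
            // No converter for this type, so there is no text to hand to the indexer.
            throw new IllegalArgumentException("No text extractor registered for: " + fileName);
        }
        return extractor.apply(content);
    }
}
```

A plain-text converter could be registered with register("txt", bytes -> new String(bytes, StandardCharsets.UTF_8)), and a PDF or Word converter would be plugged in the same way. The string returned by extractText is what would then be added to a Lucene document as a text field and passed to the index writer.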