lucene-java-user mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Lucene support for OpenDocument?
Date Thu, 20 Jul 2006 07:39:39 GMT
Daniel Noll wrote:
> marbux wrote:
>> Hello,
>>
>> The OpenDocument Fellowship attempts to maintain a directory of
>> applications supporting OpenDocument file formats.
>> <http://www.opendocumentfellowship.org/applicationsa>. I have been
>> attempting, without success, to determine whether Lucene supports
>> OpenDocument and, if so, to what extent and in which versions/flavors
>> of Lucene.
>> I have seen some indications while searching the development mailing
>> list archives that ODF support was being implemented, but I can't find
>> any indication that the work was ever completed.
>>
>> Might someone on this list speak knowledgeably to those subjects?
>
> Lucene is a text indexing library, and hence it indexes text.  For any 
> other format (HTML, Word, ODF, PDF, whatever) you have to find some 
> way to extract the text from there to feed it into Lucene.

However, a sub-project of Lucene, Nutch (http://lucene.apache.org/nutch) 
does support nearly all ODF formats through a parse-oo plugin (with the 
exception of ODG and ODB).
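
For anyone who wants to do the extraction step by hand rather than
through Nutch, here is a minimal sketch of the "extract the text, then
feed it into Lucene" idea quoted above. It relies on the fact that an
ODF file is a ZIP package whose document body lives in an entry named
content.xml. The class name OdfTextExtractor and the method
extractText are hypothetical names chosen for illustration; this is
not the parse-oo plugin's code, and it ignores metadata, styles and
embedded objects.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Lucene or Nutch): pulls plain text
// out of the content.xml entry of an ODF package.
public class OdfTextExtractor {

    /** Returns the character data of content.xml, or "" if the entry is missing. */
    public static String extractText(InputStream odf) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(odf)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"content.xml".equals(entry.getName())) {
                    continue;
                }
                // Copy the entry into memory first, so the XML parser
                // cannot close the underlying ZIP stream.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                return parse(buf.toByteArray());
            }
        }
        return "";
    }

    private static String parse(byte[] xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // so localName is "p" for text:p
        factory.newSAXParser().parse(new ByteArrayInputStream(xml), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Assumption: a line break after each paragraph (p) or
                // heading (h) element is good enough for indexing.
                if ("p".equals(localName) || "h".equals(localName)) {
                    text.append('\n');
                }
            }
        });
        return text.toString();
    }
}
```

The returned string is what you would then hand to Lucene as the
value of a text field. Real documents have tables, footnotes and
other structure that this sketch simply flattens.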

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




