lucene-java-user mailing list archives

From petite_abeille <petite_abei...@mac.com>
Subject Re: indexing PDF files
Date Fri, 03 May 2002 09:35:10 GMT

On Wednesday, May 1, 2002, at 05:41 PM, Otis Gospodnetic wrote:

> Wouldn't you want to convert to XML instead and use XSLT to transform
> the XML representation to any desired format by just applying a style
> sheet?
> Sounds like less work with bigger document type coverage.

Sounds good... But what does it mean? I'm not that familiar with any of 
the XML/XSLT hype, so I don't really understand what you are getting 
at... I just want to convert any type of document to text for indexing 
purposes... I'm not planning to do anything else with it... However, 
converting everything to PDF as a first step allows you to provide a 
"preview" of any document, even if you happen not to understand the 
original format (e.g. MS Office)...

PA



