lucene-java-user mailing list archives

From petite_abeille <>
Subject Re: indexing PDF files
Date Fri, 03 May 2002 09:35:10 GMT

On Wednesday, May 1, 2002, at 05:41 PM, Otis Gospodnetic wrote:

> Wouldn't you want to convert to XML instead and use XSLT to transform
> the XML representation to any desired format by just applying a style
> sheet?
> Sounds like less work with bigger document type coverage.

Sounds good... But what does it mean? I'm not that familiar with any of 
the XML/XSLT hype, so I don't really understand what you are getting 
at... I just want to convert any type of document to text for indexing 
purposes... I'm not planning to do anything else with it... However, 
converting everything to PDF as a first step allows you to provide a 
"preview" of any document even if you happen not to understand the 
original format (e.g. MS Office)...
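
For reference, the XML-plus-style-sheet idea quoted above can be sketched roughly as follows for the text-for-indexing case. This is only a minimal illustration using the standard javax.xml.transform API, assuming the document has already been converted to XML somehow; the class name XmlToText, the toText helper and the inline style sheet are hypothetical and not something proposed in this thread.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class XmlToText {
    // Hypothetical style sheet: outputs the concatenated text nodes of the
    // whole document and drops all markup.
    static final String TEXT_ONLY_XSL =
        "<xsl:stylesheet version=\"1.0\""
      + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
      + "<xsl:output method=\"text\"/>"
      + "<xsl:template match=\"/\"><xsl:value-of select=\".\"/></xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the style sheet to an XML string and returns the plain text.
    static String toText(String xml) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(TEXT_ONLY_XSL)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        String xml = "<doc><title>Hello</title><body> world</body></doc>";
        // The resulting string is what would be handed to the indexer.
        System.out.println(toText(xml));
    }
}
```

Swapping in a different style sheet would give a different output format from the same XML, which seems to be the point of the suggestion; it does not by itself solve getting PDF or MS Office files into XML in the first place.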

