lucene-java-user mailing list archives

From Pasha Bizhan <fc...@ok.ru>
Subject Re: Word Documents
Date Mon, 15 Dec 2003 14:35:57 GMT
Hi,

> "Gregor Heinrich" <gregor.heinrich@igd.fraunhofer.de> wrote:
>
>We had some problems using the POI Word filter. In one document set,
>everything would work fine; in another, more than 50% of the documents
>refused to work with it (they were not indexed).

By the way, a Word document can be saved in two modes: fast 
save mode and full save mode. The two produce different file 
layouts. The fast-save layout is unusual and makes text 
extraction hard. 

Maybe your files were saved in fast save mode. You could try 
re-saving them in full save mode. It's only an idea.
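
To check this before re-saving anything, one option is to look at the flag Word writes into the file header. The sketch below is only an illustration under stated assumptions: it assumes you have already pulled the raw bytes of the "WordDocument" stream out of the OLE2 container (for example with POI's POIFS classes), that the file is in the Word 97 or later binary format (magic number 0xA5EC), and that the flags word sits at byte offset 10 with the fComplex bit at 0x0004 and the quick-save counter in bits 4-7, as the published binary format describes. The class name FastSaveCheck and its methods are hypothetical, not part of POI.

```java
/**
 * Hypothetical helper (not part of POI): inspects the start of a Word
 * binary "WordDocument" stream to guess whether the file was fast-saved.
 * Assumes the Word 97+ header layout described in the lead-in.
 */
public final class FastSaveCheck {
    // Magic number at offset 0 of a Word 97+ WordDocument stream.
    private static final int WORD_MAGIC = 0xA5EC;
    // Assumed offset of the 16-bit flags word in the header.
    private static final int FLAGS_OFFSET = 0x0A;
    // Assumed fComplex bit: set when the last save was incremental (fast save).
    private static final int F_COMPLEX = 0x0004;

    private FastSaveCheck() {}

    // Reads an unsigned 16-bit little-endian value.
    static int readUInt16LE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    // Validates the header and returns the flags word.
    private static int readFlags(byte[] wordDocumentStream) {
        if (wordDocumentStream == null || wordDocumentStream.length < FLAGS_OFFSET + 2) {
            throw new IllegalArgumentException("stream too short for a Word header");
        }
        if (readUInt16LE(wordDocumentStream, 0) != WORD_MAGIC) {
            throw new IllegalArgumentException("not a Word 97+ WordDocument stream");
        }
        return readUInt16LE(wordDocumentStream, FLAGS_OFFSET);
    }

    /** True if the header says the last save was a fast (incremental) save. */
    public static boolean isFastSaved(byte[] wordDocumentStream) {
        return (readFlags(wordDocumentStream) & F_COMPLEX) != 0;
    }

    /** Number of consecutive fast saves recorded in the header (0 to 15). */
    public static int quickSaveCount(byte[] wordDocumentStream) {
        return (readFlags(wordDocumentStream) >> 4) & 0x0F;
    }
}
```

If isFastSaved returns true for the documents that fail, that would support the guess above, and those files are the ones worth re-saving in full save mode.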

Pasha 
http://sf.net/projects/lucenedotnet


