lucene-general mailing list archives

From ppuyen
Subject indexing html and pdf files of Russian language
Date Fri, 12 Dec 2008 01:28:56 GMT

Hi, everybody.

I have a problem. So far I have:
1. Indexed Russian-language text files successfully.
2. Indexed PDF, HTML, and PPT files with StandardAnalyzer successfully.
But now I need to index HTML, PDF, and similar files written in Russian,
and it only works for plain text files. It does not work for HTML or PDF.
(When I debug the indexing of a Russian-language HTML file, the text shows
up as unreadable characters.)
Can anyone tell me why, and how I can index Russian-language HTML, PDF,
and similar files?

Thanks a lot.