lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: problems with search on Russian content
Date Thu, 21 Nov 2002 15:35:50 GMT
Look at the CHANGES.txt file in CVS: there is some new code in the
org.apache.lucene.analysis.ru package that you will want to use.
Get Lucene from the nightly build.

Otis

--- Andrey Grishin <grishin@softline.kiev.ua> wrote:
> Hi All,
> I have a problem searching Russian content using Lucene 1.2.
> 
> I indexed the content using the Cp1251 charset:
> ------------
> text = new String(text.getBytes("Cp1251"));
> doc.add(Field.Text(CONTENT_FIELD,text));
> 
> ------------
> and I am searching using the same charset:
> 
> String txt = "";
> txt = new String(txt.getBytes("Cp1251"));
> PrefixQuery query = new PrefixQuery(new
> Term(PortalHTMLDocument.CONTENT_FIELD, txt));
> hits = searcher.search(query);
> 
> or 
> 
> Analyzer analyzer = new StandardAnalyzer();
> String txt = "";
> txt = new String(txt.getBytes("Cp1251"));
> Query query = QueryParser.parse(txt,
> PortalHTMLDocument.CONTENT_FIELD, analyzer);
> 
> hits = searcher.search(query);
> 
> 
> and Lucene can't find anything.
> I also checked for a DecodeInterceptor in my server.xml; there
> isn't one.
> 
> I tried UTF-8/16 and got the same result.
> 
> Also, if I list the index's contents by iterating over an IndexReader,
> I can see that my Russian content is stored in the index.
> Can you please help me? Do you have any more ideas about what else
> can be done here to fix this problem?
> 
> I would appreciate any help.
> Thanks, Andrey.
> 
> P.S.
> I am using Lucene 1.2, Tomcat 4.1.12, JDK 1.4.1 on Win2000 AS
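
A note on the quoted snippets: new String(text.getBytes("Cp1251")) encodes the
string to Cp1251 bytes and then decodes those bytes with the JVM's default
charset, so the result depends on the platform's file.encoding setting. Naming
the charset when decoding raw bytes avoids that dependency. The sketch below
uses only standard JDK calls to show the difference. The class and method names
are hypothetical, made up for this illustration, and whether this round trip is
the actual cause of the failed searches is an assumption that the thread does
not confirm.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper class for illustration only; not part of Lucene.
class CharsetRoundTrip {

    // Encode a string into bytes using an explicitly named charset.
    static byte[] encode(String text, String charsetName) {
        try {
            return text.getBytes(charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Charset not available: " + charsetName);
        }
    }

    // Decode raw bytes using an explicitly named charset.
    static String decode(byte[] raw, String charsetName) {
        try {
            return new String(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("Charset not available: " + charsetName);
        }
    }

    // The pattern from the quoted message: Cp1251 bytes are decoded with
    // whatever the platform default charset happens to be.
    static String roundTripWithDefault(String text) {
        return new String(encode(text, "Cp1251"));
    }

    public static void main(String[] args) {
        // The Russian word for "search", written with Unicode escapes so the
        // source file's own encoding does not matter.
        String word = "\u043f\u043e\u0438\u0441\u043a";

        System.out.println("file.encoding = " + System.getProperty("file.encoding"));

        // Platform dependent: true only when the default charset maps these
        // bytes back to the same characters (for example, when it is Cp1251).
        System.out.println("default round trip preserved text: "
                + word.equals(roundTripWithDefault(word)));

        // Platform independent: the same charset is named on both sides.
        System.out.println("explicit round trip preserved text: "
                + word.equals(decode(encode(word, "Cp1251"), "Cp1251")));
    }
}
```

If the text is already a correct Java String (for example, read through a
Reader opened with the right encoding), no re-encoding step should be needed
before it is handed to the indexer or to the query.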




