lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Harpreet S Walia" <xalanl...@yahoo.com>
Subject Re: Problem with Unicode !!!
Date Sun, 01 Apr 2001 16:05:31 GMT
Hi,

We are currently testing with plain text unicode files made in windows 2000
notepad. The files are in English and devanagri (Hindi) .
Now , the english words when shown in the results come properly but the
devanagri words come as junk although they are searchable.

As far as encoding goes we have saved the files in utf-8 in windows . Also
we have not worked extensively on lucene so we would be greatful if you can
tell us how to check the encoding of the lucene index file.

Thanks And Regards
Harpreet


> Hi,
>
> Please give an example of what you have indexed and what you are searching
> on.
>
> Also, are you sure the text you are searching for has been encoded
properly?
>
> --Peter
>
>
> On 4/1/02 3:59 AM, "eeed wewefwf" <xalanlist@yahoo.com> wrote:
>
> > Hi,
> >
> > I am working on lucene to index unicode content. I am
> > facing the following problems .
> >
> > 1) I am creating a index where i am adding  two fields
> > in the index without specifying any encoding. one
> > field is title and the other is body
> >
> > e.g :- doc.add(Field.Text("body",(Reader)isr));
> > where isr is my InputStream Reader.
> >
> > Now i am able to search for the words but the title
> > when displayed in the browser shows junk. inspite of
> > the correct encoding in the browser (UTF-8).
> >
> > 2) I also tried specifying enconding in the
> > InputStreamReader . In this case the title comes
> > properly but i am not able to search non english
> > words.
> >
> > I am trying this on a win2000 machine.
> >
> > I would really appriciate some help
> >
> > TIA,
> >
> > Regards
> > Harpreet.
> >
> > __________________________________________________
> > Do You Yahoo!?
> > Yahoo! Greetings - send holiday greetings for Easter, Passover
> > http://greetings.yahoo.com/
> >
> > --
> > To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
> > For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>
> >
> >
>
>
> --
> To unsubscribe, e-mail:
<mailto:lucene-user-unsubscribe@jakarta.apache.org>
> For additional commands, e-mail:
<mailto:lucene-user-help@jakarta.apache.org>


--
To unsubscribe, e-mail:   <mailto:lucene-user-unsubscribe@jakarta.apache.org>
For additional commands, e-mail: <mailto:lucene-user-help@jakarta.apache.org>


Mime
View raw message