lucene-java-user mailing list archives

From Liaqat Ali <liaqatalim...@gmail.com>
Subject Re: StopWords problem
Date Wed, 26 Dec 2007 20:33:28 GMT
李晓峰 wrote:
> "javac" has an "-encoding" option that tells the compiler which 
> encoding the source file uses; this will probably solve the 
> problem.
> Or you can try Unicode escapes (\uxxxx); then you can save the file in 
> ANSI, though it is hard for humans to read.
> Or use an IDE; Eclipse is a good choice: you can set the source file 
> encoding, and it will pass it to the compiler for you.
>
> regards.
>> Hi, Doro Cohen
>>
>> Thanks for your reply, but I am facing a small problem here. As 
>> I am using Notepad for coding, in which encoding should the file 
>> be saved?
>>
>>
>> public static final String[] URDU_STOP_WORDS = { "کے" ,"کی" ,"سے" 
>> ,"کا" ,"کو" ,"ہے" };
>>
>> Analyzer analyzer = new StandardAnalyzer(URDU_STOP_WORDS);
>>
>>
>> If I save it in ANSI format, it loses the contents. I tried 
>> Unicode, but it does not work, and I also tried UTF-8, but it 
>> generates two errors about two illegal characters. What 
>> should the solution be? Kindly guide me in this.
>>
>> Thanks ..
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>
Hi,
Thanks a lot for your suggestion.
Using javac -encoding UTF-8 still raises the following error:

urduIndexer.java : illegal character: \65279
?
^
1 error

What am I doing wrong?
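
For reference, the character \65279 in the error above is U+FEFF (65279 is 0xFEFF in decimal), the byte order mark. Notepad writes it as the three bytes EF BB BF at the start of a file saved as UTF-8, and javac does not skip it. One possible workaround is to remove those bytes before compiling. The following is only a minimal sketch of that idea: the class name BomStripper and its methods are hypothetical, not part of Lucene or the JDK, and it assumes a JDK recent enough to provide java.nio.file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not part of Lucene or the JDK): removes a leading
// UTF-8 byte order mark so that javac no longer sees U+FEFF.
final class BomStripper {
    // UTF-8 encoding of U+FEFF (decimal 65279).
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns true if the data starts with the three BOM bytes. */
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= UTF8_BOM.length
                && data[0] == UTF8_BOM[0]
                && data[1] == UTF8_BOM[1]
                && data[2] == UTF8_BOM[2];
    }

    /** Returns a copy without the leading BOM, or the same array if there is none. */
    static byte[] stripUtf8Bom(byte[] data) {
        return hasUtf8Bom(data)
                ? Arrays.copyOfRange(data, UTF8_BOM.length, data.length)
                : data;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomStripper <source-file>");
            return;
        }
        Path path = Paths.get(args[0]);
        byte[] original = Files.readAllBytes(path);
        byte[] cleaned = stripUtf8Bom(original);
        if (cleaned != original) {
            // Overwrites the file in place; keep a backup if that matters.
            Files.write(path, cleaned);
            System.out.println("Removed UTF-8 BOM from " + path);
        } else {
            System.out.println("No UTF-8 BOM found in " + path);
        }
    }
}
```

Under those assumptions, running it once on urduIndexer.java and then compiling with javac -encoding UTF-8 should avoid the illegal character error. Saving the file from an editor that can write UTF-8 without a BOM would have the same effect.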
