lucene-java-user mailing list archives

From Karl Øie <k...@gan.no>
Subject Re: Ask about method QueryParser.parser
Date Thu, 10 Nov 2005 09:16:03 GMT
If you are running on Tomcat, I think it defaults to ISO-8859-1 as the
character encoding. Try recoding your input to UTF-8 before feeding it to Lucene.

s = new String(s.getBytes("ISO-8859-1"), "UTF-8");
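
A minimal, self-contained sketch of that recoding step, assuming the container decoded UTF-8 request bytes as ISO-8859-1. The class and method names (RecodeSketch, latin1ToUtf8) are hypothetical and not part of Lucene or the Servlet API, and StandardCharsets needs Java 7 or later. Both charsets are named explicitly because getBytes() with no argument uses the platform default charset.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not a Lucene or Servlet class.
final class RecodeSketch {

    // Assumes s was produced by decoding UTF-8 bytes as ISO-8859-1.
    // ISO-8859-1 maps each byte to exactly one char, so the original
    // bytes can be recovered and then decoded again as UTF-8.
    static String latin1ToUtf8(String s) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A non-ASCII query term, written with escapes to keep the source ASCII.
        String original = "Ch\u00f9a_H\u01b0\u01a1ng";

        // Simulate the mis-decoding: UTF-8 bytes read as ISO-8859-1.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        // The round trip restores the original text.
        System.out.println(latin1ToUtf8(misdecoded).equals(original));
    }
}
```

Note that for a pure-ASCII value such as "Chua_Huong" this conversion returns the string unchanged, because ASCII bytes are identical in both encodings.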

Karl

On 10. nov. 2005, at 04.10, Hai Do Thanh wrote:

> Thanks for your reply :)
>
> I have already debugged the input string s. As I said
> before, s is a string sent by the client through
> the servlet's doPost() method.
>
> At first, I thought that the analyzer was the cause of
> the problem and that it lowercases all letters. However,
> I have since learned that WhitespaceAnalyzer does
> not do that.
>
> Therefore, I changed my mind and now wonder whether the
> problem comes from the servlet technology. Yet
> when I debugged the input string s, I saw that before
> the method call, s was still identical to what was
> sent from the client.
>
>
> --- Karl Øie <karl@gan.no> wrote:
>
>> Sounds very strange, have you debugged the input
>> string s? Where does
>> it come from?
>>
>> Karl
>>
>> On 9. nov. 2005, at 05.00, Hai Do Thanh wrote:
>>
>>> Dear all,
>>>
>>> I really appreciate your work on Lucene. It is
>>> clearly a helpful API for my project on indexed
>>> document searching. On the whole, it works properly
>>> and perfectly.
>>>
>>> However, there is a problem when I try to query what I
>>> have indexed with a keyword received over the
>>> internet through a servlet. The specific
>>> problem is as follows:
>>>
>>> // This is the method I use to parse my query
>>> Query myQuery = QueryParser.parse(s, "NamedEntity", analyzer);
>>>
>>> where s is the query string received from the client,
>>> analyzer is a WhitespaceAnalyzer, and the Lucene
>>> version is lucene-1.4-final.jar.
>>>
>>> Before this method, the value of s is "Chua_Huong",
>>> with C and H uppercase.
>>> After this method, myQuery = NamedEntity:chua_huong,
>>> with c and h lowercase.
>>>
>>> Yet if I type the string "Chua_Huong" directly
>>> as the first argument (in place of s),
>>>
>>> Query myQuery = QueryParser.parse("Chua_Huong", "NamedEntity", analyzer);
>>>
>>> then after the method, myQuery = NamedEntity:Chua_Huong,
>>> with C and H still uppercase.
>>>
>>> What is happening here?
>>> Please reply as soon as you can if you have an answer.
>>> I would be most grateful for your help!
>>> Thanks in advance
>>>
>>> Sincerely yours
>>> Thanh Hai
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>
>> - there is nothing wrong with using linux. if that's
>> the lifestyle
>> you want to live, i wont judge you. i just wont
>> support you at the
>> parades.
>>
>>
>

- Real life should have a search function. I need my socks.

