lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "叶双明" <yeshuangm...@gmail.com>
Subject Re: Lucene search fails for japanese characters in URL
Date Thu, 18 Sep 2008 00:43:01 GMT
You must trace the string  in each step!
Important step is get string from MYSQL and get parameter in servlet, please
check it, do you get the right string?
Chinese has the same problem too.

2008/9/17 anandsarwade <anand.sarwade@corp.aol.com>

>
> Hello Jimi,
>
> Thanks a lot for your valuable suggestion.
>
> I am using tomcat 5 . As per your suggestions ,checked the server.xml but
> found that no URIEncoding was set.
> I have set now and to my great relief :-) i could see the Lucene results on
> my browser for japanese string with request objects in UTF-8 now.
>
> Thanks again for your help.
>
> Regards,
> Anand.
>
>
> JimiH wrote:
> >
> > What webserver are you using? For example, with Tomcat, it could be
> > because of the setting URIEncoding in server.xml.
> >
> > http://tomcat.apache.org/tomcat-5.5-doc/config/http.html
> >
> > /Jimi
> >
> > mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30
> > stockholm sweden | +46 8 506 66 172 | +46 765 27 19 55 |
> > jimi.hullegard@mogul.com | www.mogul.com
> >
> >
> >> -----Original Message-----
> >> From: anandsarwade [mailto:anand.sarwade@corp.aol.com]
> >> Sent: den 17 september 2008 16:42
> >> To: java-user@lucene.apache.org
> >> Subject: Lucene search fails for japanese characters in URL
> >>
> >>
> >> Hi ,
> >>
> >> I am facing below problem. Please help me in this.
> >>
> >> I have integrated CJK Analyzer for Japanese characters. I am
> >> able to save
> >> japanese double byte characters in mysql database in UTF-8
> >> format without
> >> issues. I could that data is getted indexed. Now when i
> >> search the Japanese
> >> characters which were indexed using the URL below , returns
> >> empty results.
> >>
> >> http://xml.demo.myaol.jp:8082/portal/gallery-search?first=1&ma
> >> x=100&cap=言語
> >>
> >> Noticed that the above url gets converted to the following
> >> URL having some
> >> HTML encoded strings in search.
> >>
> >> http://xml.demo.myaol.jp:8082/portal/gallery-search?first=1&ma
> >> x=100&cap=%E8%A8%80%E8%AA%9E
> >>
> >> This does not match with the existing lucene indexes
> >> henceforth returns
> >> empty results.  How do i solve this lucene search issue
> >> having japanese
> >> words in URLs.? Is there any way to convert such characters
> >> back to Japanese
> >> words???
> >>
> >> Any help/suggestions in this regards is highly appreciated.
> >>
> >> Thanks in Advance.
> >>
> >> Regards,
> >> Anand
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Lucene-search-fails-for-japanese-charact
> >> ers-in-URL-tp19533647p19533647.html
> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Lucene-search-fails-for-japanese-characters-in-URL-tp19533647p19534342.html
> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>


-- 
Sorry for my english!! 明
Please help me to correct my english expression and error in syntax
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message