lucene-java-user mailing list archives

From "叶双明" <yeshuangm...@gmail.com>
Subject Re: Lucene search fails for japanese characters in URL
Date Fri, 19 Sep 2008 00:53:18 GMT
I still suggest that you set up and test a standalone IndexSearcher, even
though you believe it should work.
If it works, and Tomcat gets the right parameter, then sorry, I don't know
what the problem is.
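
To check whether Tomcat gets the right parameter, something like the following may help. It is only a minimal sketch using the JDK, not code from the application in this thread: the class and method names are made up for illustration, and the encoded value is the `cap` parameter from the original URL. It decodes that value once as UTF-8 and once as ISO-8859-1 (which, as far as I know, is what Tomcat 5 uses for the URI when URIEncoding is not set) and prints the code points, so the string can be compared at each step.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the application discussed in this thread.
class QueryParamDecodingDemo {

    // Percent-decode a raw query value using the named charset.
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    // Render a string as a list of code points, e.g. "U+8A00 U+8A9E",
    // so the value can be compared in logs regardless of console encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    // Workaround only: recover the original text from a string whose UTF-8
    // bytes were wrongly decoded as ISO-8859-1. Fixing the container's
    // URI encoding is the cleaner solution.
    static String repairLatin1(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "%E8%A8%80%E8%AA%9E"; // value of "cap" in the encoded URL

        String asUtf8 = decode(raw, "UTF-8");
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println("UTF-8:      " + codePoints(asUtf8));
        // UTF-8:      U+8A00 U+8A9E
        System.out.println("ISO-8859-1: " + codePoints(asLatin1));
        // ISO-8859-1: U+00E8 U+00A8 U+0080 U+00E8 U+00AA U+009E
        System.out.println("repaired:   " + codePoints(repairLatin1(asLatin1)));
        // repaired:   U+8A00 U+8A9E
    }
}
```

If the servlet logs six code points instead of two for this value, the parameter was decoded with the wrong charset before it ever reached the QueryParser, and the query cannot match what the CJK analyzer put into the index.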

2008/9/18 anandsarwade <anand.sarwade@corp.aol.com>

>
> This Luke tool seems to be pretty cool. I have installed it, and it is very
> easy to inspect the index and see what is being stored. Thanks for this info.
>
> I have tried it in Tomcat and things work fine without issues. The default
> operator is OR in my case. I haven't tried setting up a standalone
> IndexSearcher, but I believe it should work. Please let me know if there are
> any issues.
>
>
> 叶双明 wrote:
> >
> > Also, you can use the tool Luke to see what is actually in the index.
> > Check what is in the Query that is passed to IndexSearcher.search(), and
> > what the default operator of the QueryParser is.
> >
> > Can you get hits by setting up a simple IndexSearcher, not through Tomcat?
> >
> > 2008/9/18 anandsarwade <anand.sarwade@corp.aol.com>
> >
> >>
> >> Hi,
> >>
> >> I do get the same string from MySQL and also in the servlet request. I could
> >> observe the actual string in Eclipse while debugging. It is stored in
> >> UTF-8 format, so it is retrieved as stored.
> >>
> >> Please let me know if I am not clear.
> >>
> >>
> >> 叶双明 wrote:
> >> >
> >> > You must trace the string at each step!
> >> > The important steps are getting the string from MySQL and getting the
> >> > parameter in the servlet. Please check them: do you get the right string?
> >> > Chinese has the same problem too.
> >> >
> >> > 2008/9/17 anandsarwade <anand.sarwade@corp.aol.com>
> >> >
> >> >>
> >> >> Hello Jimi,
> >> >>
> >> >> Thanks a lot for your valuable suggestion.
> >> >>
> >> >> I am using Tomcat 5. As per your suggestion, I checked server.xml
> >> >> and found that no URIEncoding was set.
> >> >> I have set it now, and to my great relief :-) I can see the Lucene
> >> >> results in my browser for the Japanese string, with request objects
> >> >> in UTF-8 now.
> >> >>
> >> >> Thanks again for your help.
> >> >>
> >> >> Regards,
> >> >> Anand.
> >> >>
> >> >>
> >> >> JimiH wrote:
> >> >> >
> >> >> > What web server are you using? For example, with Tomcat, it could be
> >> >> > because of the URIEncoding setting in server.xml.
> >> >> >
> >> >> > http://tomcat.apache.org/tomcat-5.5-doc/config/http.html
> >> >> >
> >> >> > /Jimi
> >> >> >
> >> >> > mogul | jimi hullegård | system developer | hudiksvallsgatan 4, 113 30
> >> >> > stockholm sweden | +46 8 506 66 172 | +46 765 27 19 55 |
> >> >> > jimi.hullegard@mogul.com | www.mogul.com
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: anandsarwade [mailto:anand.sarwade@corp.aol.com]
> >> >> >> Sent: 17 September 2008 16:42
> >> >> >> To: java-user@lucene.apache.org
> >> >> >> Subject: Lucene search fails for japanese characters in URL
> >> >> >>
> >> >> >>
> >> >> >> Hi ,
> >> >> >>
> >> >> >> I am facing the problem below. Please help me with this.
> >> >> >>
> >> >> >> I have integrated the CJK Analyzer for Japanese characters. I am
> >> >> >> able to save Japanese double-byte characters in a MySQL database in
> >> >> >> UTF-8 format without issues. I can see that the data is getting
> >> >> >> indexed. Now, when I search for the Japanese characters that were
> >> >> >> indexed, using the URL below, it returns empty results.
> >> >> >>
> >> >> >> http://xml.demo.myaol.jp:8082/portal/gallery-search?first=1&max=100&cap=言語
> >> >> >>
> >> >> >> I noticed that the above URL gets converted to the following
> >> >> >> URL, in which the search term is percent-encoded (URL-encoded).
> >> >> >>
> >> >> >> http://xml.demo.myaol.jp:8082/portal/gallery-search?first=1&max=100&cap=%E8%A8%80%E8%AA%9E
> >> >> >>
> >> >> >> This does not match the existing Lucene index and hence returns
> >> >> >> empty results. How do I solve this Lucene search issue with
> >> >> >> Japanese words in URLs? Is there any way to convert such characters
> >> >> >> back to Japanese words?
> >> >> >>
> >> >> >> Any help or suggestions in this regard are highly appreciated.
> >> >> >>
> >> >> >> Thanks in Advance.
> >> >> >>
> >> >> >> Regards,
> >> >> >> Anand
> >> >> >>
> >> >> >> --
> >> >> >> View this message in context:
> >> >> >> http://www.nabble.com/Lucene-search-fails-for-japanese-characters-in-URL-tp19533647p19533647.html
> >> >> >> Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> ---------------------------------------------------------------------
> >> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >> >>
> >> >> >>
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >> > --
> >> > Sorry for my english!! 明
> >> > Please help me to correct my english expression and error in syntax
> >> >
> >> >
> >>
> >>
> >>
> >
> >
> >
> >
>
>
>


-- 
Sorry for my english!! 明
Please help me to correct my english expression and error in syntax