lucene-solr-commits mailing list archives

From Apache Wiki <>
Subject [Solr Wiki] Update of "FAQ" by ThorstenScherler
Date Fri, 02 Feb 2007 13:19:56 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Solr Wiki" for change notification.

The following page has been changed by ThorstenScherler:

The comment on the change is:
Adding findings about possible encoding problems and fixes

  If you notice a problem with multibyte characters, the first step to ensuring that it is
not a true Solr bug would be to write a unit test that bypasses the application server directly
using the [
+ The most important points are:
+  * Documents have to be indexed on the Solr server as UTF-8. If you send, e.g., an ISO-encoded
document, the special ISO characters get a byte added, which corrupts the final encoding;
only reindexing with UTF-8 can fix this.
+  * The client needs to use UTF-8 URL encoding when forwarding the search request to the Solr server.
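
To make the two points above concrete, here is a minimal sketch in plain Java. The class and method names (`EncodingDemo`, `encodeQuery`, `encodeQueryIso`) are hypothetical and not part of Solr, and "ISO" is assumed to mean ISO-8859-1. It shows that the same query value is URL-encoded differently under each charset, and that a non-ASCII character takes a different number of bytes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of Solr.
class EncodingDemo {

    // URL-encode a query value as UTF-8, as the client should do
    // before forwarding the search request.
    static String encodeQuery(String q) {
        try {
            return URLEncoder.encode(q, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // The same value URL-encoded as ISO-8859-1 (assumed meaning of "ISO").
    static String encodeQueryIso(String q) {
        try {
            return URLEncoder.encode(q, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "cafe" with an acute accent on the e, written as an escape so the
        // source file encoding does not matter.
        String q = "caf\u00e9";
        System.out.println(encodeQuery(q));    // caf%C3%A9
        System.out.println(encodeQueryIso(q)); // caf%E9
        // The accented character is two bytes in UTF-8 and one in ISO-8859-1.
        System.out.println(q.getBytes(StandardCharsets.UTF_8).length);      // 5
        System.out.println(q.getBytes(StandardCharsets.ISO_8859_1).length); // 4
    }
}
```

A server that expects UTF-8 and receives `caf%E9` sees a lone byte that is not a valid UTF-8 sequence, which is one way a query can end up matching nothing.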

+ If you simply read the query string with:
+ {{{
+ String value = request.getParameter("q");
+ }}}
+ then q may have been encoded in ISO, in which case Solr will not return a search result.
+ One possible solution is:
+ {{{
+ String encoding = request.getCharacterEncoding();
+ if (null == encoding) {
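+   // No encoding was declared by the client: set your default encoding here.
+   // This must be called before any request parameter is read.
+   request.setCharacterEncoding("UTF-8");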
+ } else {
+   request.setCharacterEncoding(encoding);
+ }
+ ...
+ String value = request.getParameter("q");
+ }}}
+ Another possibility is to use to transform all parameter
values to UTF-8.
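
The sentence above does not say which helper it has in mind. As an assumption, one common way to transform a parameter value is to re-interpret its bytes: if the container decoded UTF-8 bytes as ISO-8859-1, the original text can be recovered by turning the string back into ISO-8859-1 bytes and decoding those as UTF-8. The class and method names below (`ParamRecoder`, `isoToUtf8`) are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper; not part of Solr or the servlet API.
class ParamRecoder {

    // Re-interpret a value that was decoded as ISO-8859-1 although the
    // client actually sent UTF-8 bytes.
    static String isoToUtf8(String value) {
        if (value == null) {
            return null;
        }
        byte[] raw = value.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "caf\u00e9"; // what the user typed
        // Simulate a container that decodes the UTF-8 bytes as ISO-8859-1.
        String misdecoded = new String(
                sent.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        System.out.println(misdecoded.length());                // 5
        System.out.println(isoToUtf8(misdecoded).equals(sent)); // true
    }
}
```

This only helps when the value really was mis-decoded as ISO-8859-1. Applying it to a value that was already decoded correctly damages any non-ASCII characters, so it is worth checking which case applies before using it on every parameter.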
  == Solr started, and I can POST documents to it, but the admin screen doesn't work ==
  The admin screens are implemented using JSPs, which require a JDK (instead of just a JRE)
to be compiled on the fly.  If you encounter errors trying to load the admin pages, and the
stack traces of these errors seem to relate to the compilation of JSPs, make sure you have a JDK
installed, and make sure that it is the Java installation actually being used.
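
One rough way to check which Java installation a process is using, and whether it ships a compiler, is sketched below. The class name `JdkCheck` is hypothetical. The sketch assumes Java 6 or later for `javax.tools.ToolProvider`, and it only reports on the JVM that runs it, so it is informative only when started with the same `java` that the application server uses.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical diagnostic class; not part of Solr.
class JdkCheck {

    // Directory of the Java installation running this code.
    static String javaHome() {
        return System.getProperty("java.home");
    }

    // True if this JVM exposes a system Java compiler, which is
    // normally the case for a JDK and not for a plain JRE.
    static boolean hasCompiler() {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    public static void main(String[] args) {
        System.out.println("java.home = " + javaHome());
        System.out.println("compiler available = " + hasCompiler());
    }
}
```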
