lucene-dev mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: Unsupported encoding GB18030
Date Mon, 04 Apr 2011 15:06:01 GMT
To come back to the original issue:
If you are using a JRE-only installation that was put on your operating
system by the standard mechanism (the browser automatically installing the
Java Plugin, or similar), the following applies:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6329080

To reduce download size, the JRE-only installation does not contain the
full charsets.jar, so the problem is expected. In fact, those JREs contain
only the basic charsets, as Robert said, plus the ones needed for your region
(the installer analyzes your environment and chooses between western,
eastern and possibly others, downloading only the corresponding
charsets.jar).
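
To see whether a particular JVM actually ships a given charset, you can ask
it directly. The following is only a minimal sketch and not part of Solr; the
class name CharsetCheck and the method isAvailable are made up for
illustration, and it relies only on java.nio.charset.Charset from the
standard library:

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of Solr or Lucene.
final class CharsetCheck {

    /** Returns true if the running JVM provides the named charset. */
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name is syntactically invalid, so treat it as unavailable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Check the names given on the command line, or a default pair.
        String[] names = args.length > 0 ? args : new String[] {"UTF-8", "GB18030"};
        for (String name : names) {
            System.out.println(name + ": "
                    + (isAvailable(name) ? "supported" : "NOT supported"));
        }
        // A reduced JRE should report noticeably fewer charsets than a JDK.
        System.out.println("Charsets available: " + Charset.availableCharsets().size());
    }
}
```

Running this with the same java binary that starts Solr should show whether
GB18030 is missing from that installation.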

We should maybe add a note to Solr that you should in all cases use a JRE
installation with full locale support or, better, a JDK; otherwise the full
international functionality of Solr cannot be used.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Jan Høydahl [mailto:jan.asf@cominvent.com]
> Sent: Monday, April 04, 2011 1:37 PM
> To: dev@lucene.apache.org
> Subject: Re: Unsupported encoding GB18030
> 
> >>> : I don't see the reason why "exampledocs" should contain docs with
> >>> narrow charsets not guaranteed to be supported.
> >> personally i would like to see us add a lot more exampledocs in a lot
> >> more esoteric encodings, precisely to help end users sanity test this
> >> sort of thing. we frequently get questions from people about character
> >> encoding wonkiness, and things like test_utf8.sh, utf8-example.xml,
> >> and now gb18030-example.xml can help us narrow down the problem:
> >> their client code, their servlet container, or solr?
> >
> > Same here. In my opinion, an example set of files should also contain
> > "more complicated" ones to show what Solr can do. If some of them
> > don't work, it's not really a problem. Maybe we should simply add a
> > "tag" to the filename to mark them as not working in every
> > configuration.
> 
> Positive to more example docs!
> 
> My concern was that since indexing exampledocs/*.xml is perhaps THE most
> common action any new Solr user will do, it should just work, and it's a
> benefit if the results revolve around the same theme, a set of products
> with category and prices. We definitely want to show off more advanced
> features, and we should add more example documents for that. Plain test
> docs could be placed in a subfolder "exampledocs/extras" or something.
> 
> Regarding the Windows XP VMware image I was using, it had a Sun JRE (not
> JDK) which was auto-updated from 1.5 to 1.6.
> After completely uninstalling Java and re-installing
> jdk-6u24-windows-i586.exe the GB18030 encoding is supported.
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org




