lucene-dev mailing list archives

From Jan Høydahl <>
Subject Re: Unsupported encoding GB18030
Date Mon, 04 Apr 2011 11:36:38 GMT
>>> : I don't see the reason why "exampledocs" should contain docs with narrow
>>> charsets not guaranteed to be supported.
>> personally i would like to see us add a lot more exampledocs in a lot more
>> esoteric encodings, precisely to help end users sanity test this sort of
>> we frequently get questions from people about character encoding
>> wonkiness, and things like, utf8-example.xml, and now
>> gb18030-example.xml can help us narrow down the problem: their client
>> code, their servlet container, or solr?
> Same here. In my opinion, an example set of files should also contain "more
> complicated" ones to show what Solr can do. If some of them don't work, it's
> not really a problem. Maybe we should simply add a "tag" to the filename to
> mark them as not working in every configuration.

I'm all for more example docs!

My concern was that since indexing exampledocs/*.xml is perhaps THE most common action any
new Solr user will do, it should just work, and it's a benefit if the results revolve around
the same theme, a set of products with category and prices. We definitely want to show off
more advanced features, and we should add more example documents for that. Plain test docs
could be placed in a subfolder such as "exampledocs/extras".

Regarding the Windows XP VMware image I was using: it had a Sun JRE (not a JDK) that had been
auto-updated from 1.5 to 1.6. After completely uninstalling Java and reinstalling from
jdk-6u24-windows-i586.exe, the GB18030 encoding is supported.
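
For anyone who wants to check their own JVM before posting gb18030-example.xml, here is a minimal sketch using the standard java.nio.charset.Charset API. The class name CharsetCheck and the helper isSupported are placeholders of mine, not something that ships with Solr, and the check only tells you what the running JVM offers, not why a charset is missing.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

public class CharsetCheck {

    // Returns true if the running JVM can encode/decode the named charset.
    // Syntactically illegal names are treated as "not supported".
    static boolean isSupported(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to GB18030, the encoding discussed in this thread.
        String name = args.length > 0 ? args[0] : "GB18030";
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println(name + " supported: " + isSupported(name));
    }
}
```

Compile and run it with the same Java installation that starts your servlet container, e.g. "javac CharsetCheck.java" followed by "java CharsetCheck GB18030". If it reports false, the problem is in the Java installation rather than in Solr or the client code.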

Jan Høydahl, search solution architect
Cominvent AS
