Date: Thu, 1 Feb 2007 14:14:54 -0800 (PST)
From: Chris Hostetter
To: solr-dev@lucene.apache.org
Subject: Re: resin and UTF-8 in URLs
: If we can do something small that makes the most normal cases work
: even if the container is not configured, that seems good.

but how do we know the user wants what we consider a "normal case" to work? ... if every servlet container lets you configure your default charset differently, we have no easy way to tell if/when they've configured the default properly, and so no way to know if we should override it.

If someone does everything in Shift-JIS, sets up their servlet container with Shift-JIS as the default, and installs Solr -- I don't want them to think Solr sucks because there is a default in Solr they don't know about (or know how to disable) that assumes UTF-8.

On the other hand: if someone really hasn't thought about charsets at all, then it doesn't seem that bad to use whatever default their servlet container says to use -- as I understand it, some containers (Tomcat included) pick their default based on the JVM's configuration (I assume from the "user.language" system property) ... that certainly seems like a better default than for us to assume UTF-8 -- even if it is "latin1" for "en", because most novice users are probably okay with latin1 ... if you're starting to worry about more complex characters that aren't in the default charset your servlet container picks for you, then reading a little documentation is a good idea.

: At the very least, we should change the examples in:
: http://wiki.apache.org/solr/SolrResin etc

oh absolutely.

-Hoss
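
To make the stakes of the default concrete, here is a minimal Java sketch of how the same percent-encoded URL value decodes to different text depending on which charset is assumed. It is not Solr or container code: the class name CharsetDecodeDemo and the helper decode are hypothetical, and it relies only on the standard java.net.URLDecoder.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class; not part of Solr or any servlet container.
class CharsetDecodeDemo {

    // Decode a percent-encoded value under an assumed charset.
    // The checked exception is wrapped so callers need no throws clause.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // The client sent "caf\u00e9" encoded as UTF-8 bytes: 0xC3 0xA9.
        String raw = "caf%C3%A9";

        // Assuming UTF-8 recovers the original 4-character string.
        String asUtf8 = decode(raw, "UTF-8");

        // Assuming latin1 turns each byte into its own character,
        // giving a 5-character string ending in U+00C3 U+00A9.
        String asLatin1 = decode(raw, "ISO-8859-1");

        System.out.println(asUtf8.length() + " chars when read as UTF-8");
        System.out.println(asLatin1.length() + " chars when read as ISO-8859-1");
    }
}
```

Neither decoding is wrong in itself; the result is only correct when the assumed charset matches what the client actually sent, which is why a hidden default on either side can surprise someone.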