From: Ken Frank <Ken.Frank@Sun.COM>
Date: Wed, 12 Sep 2007 08:37:08 -0700
Subject: Re: Derby and character set encodings
To: David Van Couvering, derby-dev@db.apache.org
Cc: Derby Discussion, dev@db.netbeans.org, Andrey Komarov

The one remaining question, for the folks at derby-user (and adding derby-dev), is the first one:
1.  When one creates a new Derby database,
is the database created with a certain encoding that will be used?

Or is there an argument to the create command that can indicate the encoding to be used?

And if so, is that encoding the default encoding of the locale I am in when I run
the create database command, or is it always UTF-8?
(For example, for one of the Japanese locales of Solaris, the encoding
is EUC-JP.)

Or could it be the encoding of the locale the actual database server
is started in? (That might be Java's view of the user's locale/encoding,
which I think would be the same as the OS locale the user is in.)

That is, the user might start the db server in a different locale from the one where they start NetBeans.
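
One way to check what "Java's view" of the locale and encoding is in each environment is to print the JVM defaults from the same shell that starts the database server, and again from the one that starts NetBeans. This is only a minimal sketch using standard JDK calls; the class name LocaleProbe is made up for illustration, and it says nothing about what Derby itself does with these values.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class LocaleProbe {
    // Summarize the locale and charset defaults this JVM picked up at startup.
    static String describe() {
        Locale locale = Locale.getDefault();
        Charset charset = Charset.defaultCharset();
        return "locale=" + locale
                + " charset=" + charset.name()
                + " file.encoding=" + System.getProperty("file.encoding");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it from each shell shows whether the two JVMs would see different defaults.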

Thanks - Ken
===========================================================================



David Van Couvering wrote:
> I think I can actually answer some of these questions :)
>
> On 9/6/07, Ken Frank <Ken.Frank@sun.com> wrote:
>> Thanks David for sending this.
>>
>> Let me note a few questions:
>>
>> 1.  when one creates a new database,
>> is the database created with a certain encoding that will be used?
>>
>> And if so, is that encoding that of the locale I am in when I run
>> the create database commands or is it utf-8 always?
>> (for example, for one of the Japanese locales of Solaris, the encoding of it
>> is euc-jp)
>>
>> or could it be that of the encoding of the locale the actual dbase server
>> is started in? (which might be java's view of the user's locale/encoding
>> which would be I think the same as the OS locale user is in)
>>
>> I saw this from derby docs:
>> "To support users in many different languages, Derby's SQL parser
>> understands all Unicode characters and allows any Unicode character or
>> number to be used in an identifier."
>>
>> but I don't know if it means that there is no concept of an encoding
>> for a database itself or not.
>>
>> I think with Oracle for example, there is an argument to create database
>> that lets one specify the encoding of it.
>
> This question stumps me, I'll leave it to others...
>
>> 2.  The locale the user is in when starting derby server -
>> what things are affected by that - i.e. encoding of dbase, messages to
>> user (if translated), time, date, etc.?
>> (vs user needing to set separate variables or properties)
>
> I don't know what "encoding of the dbase" means, but the other display
> stuff: exception messages, time and date and money formats, etc., are
> all controlled by locale.
>
>> 3.  I think it's allowed for identifiers like database names,
>> table and column names, to have non-ASCII in them, if proper
>> quoting is used when referring to them?
>
> Yes, that's right.
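
To illustrate the quoting mentioned in question 3: in standard SQL, a delimited identifier is wrapped in double quotes, and a double quote inside the name is written twice. The following is a minimal sketch, assuming Derby follows that standard rule. The helper quoteIdentifier and the Japanese table and column names are hypothetical, and the code only builds the SQL text without connecting to any database.

```java
public class QuotedIdentifiers {
    // Wrap a name as an SQL delimited identifier, doubling embedded quotes.
    static String quoteIdentifier(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    // Build a CREATE TABLE statement with a single VARCHAR column.
    static String createTableSql(String table, String column) {
        return "CREATE TABLE " + quoteIdentifier(table)
                + " (" + quoteIdentifier(column) + " VARCHAR(40))";
    }

    public static void main(String[] args) {
        // Hypothetical non-ASCII names, written as Unicode escapes so the
        // source file itself does not depend on the compiler's encoding.
        String table = "\u9867\u5ba2";   // "customer" in Japanese
        String column = "\u540d\u524d";  // "name" in Japanese
        System.out.println(createTableSql(table, column));
    }
}
```

Delimited identifiers are case-sensitive, so the same quoted spelling has to be used in later statements that refer to the table or column.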

  
>> Thanks - Ken
>>
>>
>> David Van Couvering wrote:
>>
>>> Hi, all.  I am getting some questions from Ken Frank of the NetBeans
>>> internationalization quality team about Java DB and character set
>>> encodings.  Rather than try to play go-between, I'm including him
>>> here so he can directly ask any follow-on questions.
>>>
>>> Ken would like to understand how Derby makes use of character
>>> encodings, and how it is affected by various settings.  How does
>>> Derby handle things if the encoding is set to something different from
>>> our default of UTF-8?  Are we impacted, or do we rely on Java routines
>>> such as the Collator and Comparator class to handle this?
>>>
>>> Sorry if I'm talking out my ear, i18n is not one of my fortes.
>>>
>>> Thanks,
>>>
>>> David


      

-- 
========================================
if your reply to this mail bounces,
and reply was sent to kenf@<somemachinename>,
then please reply to ken.frank@sun.com  instead
===========================================
