db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mamta Satoor" <msat...@gmail.com>
Subject Re: Collation feature discussion
Date Fri, 30 Mar 2007 18:55:45 GMT
Rick, you and Dan have the same concern about the dynamic nature of current
schema's character set association with string literal. The main reason
behind my thinking was that I didn't want the metadata queries to break in
the absence of COLLATE clause. But I had a quick off the list talk with Dan
about metadata queries and user queries and he had an excellent solution to
the problem. He recommended using the CAST function which will be able to
provide the correct character set association in metadata queries and user
queries going against system tables.

So, the string literal will always have the USER character set associated
with them (and not current schema's character set).

The metadata queries which do system character column comparison against
string literal can be rewritten as syscharcol = CAST(string_literal as
varchar(128)). Since CAST result character string type will pick up the
current schema's character set, the CAST's character set will become
UCS_BASIC. This way, the metadata queries will work fine because we have the
same character set and collation associated on each side of the comparison.

The user quries accessing system character columns can also use this CAST to
their benefit. eg CAST(syscharcol as varchar(128)) = user_char_col. Both the
sides of the comparison will have USER character set associated with them.

So, the only change I am suggesting from my mail on March 29th is that
string literal will always have the USER character set associated with them.

In addition, the answer Dan's question below is yes.
"Does 3) cover JDBC parameters as well, (ie. ?) where the type of the
parameter is a character type?"

If this covers everything with no further issues, then I will do one final
write up on collation derivation and collation type and put it on the wiki
page http://wiki.apache.org/db-derby/BuiltInLanguageBasedOrderingDERBY-1478

Thanks to everyone for putting in energy to help me arrive on this final
writeup step of collation derivation.

On 3/30/07, Rick Hillegas <Richard.Hillegas@sun.com> wrote:
> Mamta Satoor wrote:
> > lots of thoughtful analysis...
> >
> >
> > The character set specification for string literals is not as well
> > defined as for <column definitions> but my proposal here will work
> > within SQL spec boundaries. SQL spec Section 5.3<literal>, Syntax Rule
> > 14b says that if the character set is not specified for character
> > string literal, then character string literal's character set will be
> > the character set of the SQL-client module. Derby does not implement
> > SQL-client module, but definition of SQL-client module in Section 13.1
> > says that SQL-client module definition has mandatory <module name
> > clause> which is defined in Section 13.2 <module name clause>. The
> > Syntax Rule 4 in this section says that if a character set is not
> > specified for the SQL-client module, then it's character set is
> > implementation-defined. I think we can use this implementation-defined
> > character set for a SQL-client module to our advantage. We can define
> > Derby's implementation-defined character set for SQL-client module as
> > current schema's character set and hence the current schema's
> > character set will become string literal's character set. Hence, if
> > the string literal reference is made while the current schema is user
> > schema, then the character set associated with string literal will be
> > USER and if the string literal reference is made while the current
> > schema is system schema, then the character set associated with the
> > string literal will be SQL_IDENTIFER. Going this path will also make
> > sure that our metadata queries don't break.
> >
> I'm a bit uneasy about this interpretation. It seems to me that the
> string literal has a default character set and collation. Those are
> determined at the database level. They can be overridden at the session
> level and the client-module level. I think your interpretation of
> 13.2/SR4 is a bit adventurous. To me, 13.2/SR4 says that there is a
> single, implementation-defined implicit character set for a
> client-module and not a number of different implicit character sets
> which shift based on your current schema.
> Regards,
> -Rick

View raw message