Message-ID: <10305609.1189757672179.JavaMail.jira@brutus>
Date: Fri, 14 Sep 2007 01:14:32 -0700 (PDT)
From: "Mamta A. Satoor (JIRA)"
To: derby-dev@db.apache.org
Subject: [jira] Commented: (DERBY-2967) Single character does not match high value unicode character with collation TERRITORY_BASED
In-Reply-To: <24303214.1185209731243.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/DERBY-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12527398 ]

Mamta A. Satoor commented on DERBY-2967:
----------------------------------------

I talked briefly with Dan about my last patch (temp_diff.txt) and learned something I had not been aware of: two or more different characters in a territory can have the same collation element value(s) associated with them.

***********************
http://www.unicode.org/reports/tr10/
Section 1.9.2 Non-Goals
The Default Unicode Collation Element Table explicitly does not provide for the following features:
reversibility: from a Collation Element you are not guaranteed that you can recover the original character.
***********************

So, taking a fictitious territory as an example, it is possible that the collation element of the metacharacter '_' has the same value as that of, say, the character 'a'. Because of this, my approach of constructing a CollationElementIterator over the entire pattern string will not yield the expected results: once the pattern has been reduced to collation elements, a '_' wildcard can no longer be reliably told apart from a literal character that shares its element value.

I have worked on a new patch based on this fact and will post it in a couple of minutes. Please disregard the earlier files temp_diff.txt and temp_stat.txt.

> Single character does not match high value unicode character with collation TERRITORY_BASED
> -------------------------------------------------------------------------------------------
>
>                 Key: DERBY-2967
>                 URL: https://issues.apache.org/jira/browse/DERBY-2967
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.4.0.0
>            Reporter: Kathey Marsden
>            Assignee: Mamta A. Satoor
>         Attachments: temp_diff.txt, temp_stat.txt, TestFrench.java, TestNorway.java
>
>
> With TERRITORY_BASED collation, '_' does not match the character \uFA2D. It is the same for English and Norwegian. For collation UCS_BASIC it matches fine. Could you tell me if this is a bug?
> Here is a program to reproduce.
> import java.sql.*;
>
> public class HighCharacter {
>     public static void main(String args[]) throws Exception
>     {
>         System.out.println("\n Territory no_NO");
>         Class.forName("org.apache.derby.jdbc.EmbeddedDriver");
>         Connection conn = DriverManager.getConnection("jdbc:derby:nordb;create=true;territory=no_NO;collation=TERRITORY_BASED");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>         System.out.println("\n Territory en_US");
>         conn = DriverManager.getConnection("jdbc:derby:endb;create=true;territory=en_US;collation=TERRITORY_BASED");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>         System.out.println("\n Collation UCS_BASIC");
>         conn = DriverManager.getConnection("jdbc:derby:basicdb;create=true");
>         testLikeWithHighestValidCharacter(conn);
>         conn.close();
>     }
>
>     public static void testLikeWithHighestValidCharacter(Connection conn) throws SQLException {
>         Statement stmt = conn.createStatement();
>         try {
>             stmt.executeUpdate("drop table t1");
>         } catch (SQLException se) {
>             // drop failure ok.
>         }
>         stmt.executeUpdate("create table t1(c11 int)");
>         stmt.executeUpdate("insert into t1 values 1");
>
>         // \uFA2D - the highest valid character according to
>         // Character.isDefined() of JDK 1.4;
>         PreparedStatement ps =
>             conn.prepareStatement("select 1 from t1 where '\uFA2D' like ?");
>         String[] match = { "%", "_", "\uFA2D" };
>         for (int i = 0; i < match.length; i++) {
>             System.out.println("select 1 from t1 where '\\uFA2D' like " + match[i]);
>             ps.setString(1, match[i]);
>             ResultSet rs = ps.executeQuery();
>             if (rs.next() && rs.getString(1).equals("1"))
>                 System.out.println("PASS");
>             else
>                 System.out.println("FAIL: no match");
>             rs.close();
>         }
>         ps.close();
>         stmt.close();
>     }
> }
> Mamta made some comments on this issue in the following thread:
> http://www.nabble.com/Single-character-does-not-match-high-value-unicode-character-with-collation-TERRITORY_BASED.-Is-this-a-bug-tf4118767.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
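
To illustrate the point in the comment above, here is a minimal sketch of LIKE matching that recognizes the metacharacters '_' and '%' in the raw pattern string before any collation is applied, so a wildcard can never be confused with a literal that happens to collate the same way. This is not the Derby patch and not Derby's implementation; the class name CollationLikeSketch and the method like() are hypothetical. It also assumes that literals can be compared one Java char at a time, which ignores contractions, expansions, escape characters and surrogate pairs.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical sketch, not Derby code: wildcards are detected as raw
// characters; only literal characters are compared through the collator.
public class CollationLikeSketch {

    /** '_' matches exactly one char, '%' matches any run of chars (possibly empty). */
    public static boolean like(String value, String pattern, Collator collator) {
        return match(value, 0, pattern, 0, collator);
    }

    private static boolean match(String v, int vi, String p, int pi, Collator c) {
        if (pi == p.length()) {
            // Pattern exhausted: succeed only if the value is exhausted too.
            return vi == v.length();
        }
        char pc = p.charAt(pi);
        if (pc == '%') {
            // Try every possible length for the run consumed by '%'.
            for (int k = vi; k <= v.length(); k++) {
                if (match(v, k, p, pi + 1, c)) {
                    return true;
                }
            }
            return false;
        }
        if (vi == v.length()) {
            return false; // pattern still needs a character, value has none left
        }
        if (pc == '_') {
            // Wildcard decided on the raw pattern char, no collation lookup.
            return match(v, vi + 1, p, pi + 1, c);
        }
        // Literal: equal if the collator treats the two one-char strings as equal.
        // Simplifying assumption: one char corresponds to one collation unit.
        boolean same = c.compare(String.valueOf(v.charAt(vi)), String.valueOf(pc)) == 0;
        return same && match(v, vi + 1, p, pi + 1, c);
    }

    public static void main(String[] args) {
        Collator norwegian = Collator.getInstance(Locale.forLanguageTag("no-NO"));
        String[] patterns = { "%", "_", "\uFA2D" };
        for (String pattern : patterns) {
            System.out.println("'\\uFA2D' like " + pattern + " : "
                    + like("\uFA2D", pattern, norwegian));
        }
    }
}
```

The design choice here is only about ordering: the pattern is scanned as characters first, and the collator is consulted for literals alone. A real implementation would have to handle the cases the sketch assumes away.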