Return-Path: Delivered-To: apmail-db-derby-dev-archive@www.apache.org Received: (qmail 62360 invoked from network); 18 Oct 2007 15:59:21 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 18 Oct 2007 15:59:21 -0000 Received: (qmail 35683 invoked by uid 500); 18 Oct 2007 15:59:09 -0000 Delivered-To: apmail-db-derby-dev-archive@db.apache.org Received: (qmail 35476 invoked by uid 500); 18 Oct 2007 15:59:08 -0000 Mailing-List: contact derby-dev-help@db.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: Delivered-To: mailing list derby-dev@db.apache.org Received: (qmail 35466 invoked by uid 99); 18 Oct 2007 15:59:08 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 18 Oct 2007 08:59:08 -0700 X-ASF-Spam-Status: No, hits=-99.8 required=10.0 tests=ALL_TRUSTED,WHOIS_MYPRIVREG X-Spam-Check-By: apache.org Received: from [140.211.11.4] (HELO brutus.apache.org) (140.211.11.4) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 18 Oct 2007 15:59:11 +0000 Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id B5E0C71420D for ; Thu, 18 Oct 2007 08:58:50 -0700 (PDT) Message-ID: <4695791.1192723130735.JavaMail.jira@brutus> Date: Thu, 18 Oct 2007 08:58:50 -0700 (PDT) From: "Mamta A. Satoor (JIRA)" To: derby-dev@db.apache.org Subject: [jira] Commented: (DERBY-2967) Single character does not match high value unicode character with collation TERRITORY_BASED In-Reply-To: <24303214.1185209731243.JavaMail.jira@brutus> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Virus-Checked: Checked by ClamAV on apache.org [ https://issues.apache.org/jira/browse/DERBY-2967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12535953 ] Mamta A. Satoor commented on DERBY-2967: ---------------------------------------- Migrated changes from trunk(revision 585261) into 10.3(revision 586019) codeline with following commit comments Migrating changes (revision 585261) from trunk into 10.3 codeline. I had to make some manual changes after merging because CollationTest had few changes in main which were not part of 10.3 codeline. The commit comments for the trunk codeline checkin were as follows Commiting the patch (DERBY2967_Oct11_07_diff.txt) attached to DERBY-2967. The implementation of LIKE for UCS_BASIC and territory based character string types do not differ much(based on SQL standard as explained in comments to this Jira entry). I have been able to change the existing code for LIKE (in Like.java) for UCS_BASIC character strings to support territory based character strings. The existing method in Like.java now gets a new parameter and it is RuleBasedCollator. For UCS_BASIC strings, this will be passed as NULL. We check if the RuleBasedCollator is NULL and if so then we do simple one character equality check for non-metacharacters in pattern and correspnding characters in value string. But if RuleBasedCollator is not NULL, then we use it to get collation element(s) for one character at a time for non-metacharacters in patterns and corresponding characters in value string and do the collation element(s) comparison to establish equality. In addition to the above mentioned change in Like.java, I have changed the callers of the method in Like.java to pass correct value for the RuleBasedCollator. Additionally, I have added a test to CollationTest.java for the code changes. Existing like tests in CollationTest2.java were very useful in the testing of my changes. And lastly, I changed few of the existing tests to use different character string values so that when we run the full collation tests, we do not see some of the test failures which are genuine because of the nature of their data. M java/engine/org/apache/derby/iapi/types/Like.java M java/engine/org/apache/derby/iapi/types/SQLChar.java M java/engine/org/apache/derby/iapi/types/WorkHorseForCollatorDatatypes.java M java/testing/org/apache/derbyTesting/unitTests/lang/T_Like.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/CollationTest.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/CollationTest2.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/DynamicLikeOptimizationTest.java M java/testing/org/apache/derbyTesting/functionTests/tests/lang/StreamsTest.java M java/testing/org/apache/derbyTesting/functionTests/tests/nist/dml068.sql M java/testing/org/apache/derbyTesting/functionTests/tests/nist/xts729.sql M java/testing/org/apache/derbyTesting/functionTests/master/dml068.out M java/testing/org/apache/derbyTesting/functionTests/master/xts729.out > Single character does not match high value unicode character with collation TERRITORY_BASED > ------------------------------------------------------------------------------------------- > > Key: DERBY-2967 > URL: https://issues.apache.org/jira/browse/DERBY-2967 > Project: Derby > Issue Type: Bug > Components: SQL > Affects Versions: 10.4.0.0 > Reporter: Kathey Marsden > Assignee: Mamta A. Satoor > Attachments: DERBY2967_Oct11_07_diff.txt, DERBY2967_Oct11_07_stat.txt, DERBY2967_offset_based_diff_Oct02_07.txt, DERBY2967_offset_based_stat_Oct02_07.txt, fullcoll.out, patch2_setOffset_fullcoll.out, patch2_with_setOffset_diff_Sep2007.txt, patch2_with_setOffset_stat_Sep2007.txt, step1_iteratorbased_Sep1507_diff.txt, step1_iteratorbased_Sep1507_stat.txt, temp_diff.txt, temp_stat.txt, TestFrench.java, TestNorway.java > > > With TERRITORY_BASED collation '_' does not match the character \uFA2D. It is the same for english or norwegian. FOR collation UCS_BASIC it matches fine. Could you tell me if this is a bug? > Here is a program to reproduce. > import java.sql.*; > public class HighCharacter { > public static void main(String args[]) throws Exception > { > System.out.println("\n Territory no_NO"); > Class.forName("org.apache.derby.jdbc.EmbeddedDriver"); > Connection conn = DriverManager.getConnection("jdbc:derby:nordb;create=true;territory=no_NO;collation=TERRITORY_BASED"); > testLikeWithHighestValidCharacter(conn); > conn.close(); > System.out.println("\n Territory en_US"); > conn = DriverManager.getConnection("jdbc:derby:endb;create=true;territory=en_US;collation=TERRITORY_BASED"); > testLikeWithHighestValidCharacter(conn); > conn.close(); > System.out.println("\n Collation USC_BASIC"); > conn = DriverManager.getConnection("jdbc:derby:basicdb;create=true"); > testLikeWithHighestValidCharacter(conn); > } > public static void testLikeWithHighestValidCharacter(Connection conn) throws SQLException { > Statement stmt = conn.createStatement(); > try { > stmt.executeUpdate("drop table t1"); > }catch (SQLException se) > {// drop failure ok. > } > stmt.executeUpdate("create table t1(c11 int)"); > stmt.executeUpdate("insert into t1 values 1"); > > // \uFA2D - the highest valid character according to > // Character.isDefined() of JDK 1.4; > PreparedStatement ps = > conn.prepareStatement("select 1 from t1 where '\uFA2D' like ?"); > String[] match = { "%", "_", "\uFA2D" }; > for (int i = 0; i < match.length; i++) { > System.out.println("select 1 from t1 where '\\uFA2D' like " + match[i]); > ps.setString(1, match[i]); > ResultSet rs = ps.executeQuery(); > if( rs.next() && rs.getString(1).equals("1")) > System.out.println("PASS"); > else System.out.println("FAIL: no match"); > rs.close(); > } > } > } > Mamta made some comments on this issue in the following thread: > http://www.nabble.com/Single-character-does-not-match-high-value-unicode-character-with-collation-TERRITORY_BASED.-Is-this-a-bug-tf4118767.html -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.