harmony-dev mailing list archives

From "Alexei Zakharov" <alexei.zakha...@gmail.com>
Subject Re: [classlib][text] regression in text module, a non-bug difference?
Date Wed, 20 Feb 2008 12:33:42 GMT
¡Buenos días!

:) No, I'm not an expert in Spanish. But after reading your post I got
the impression that we support an additional variant of Spanish
compared to the RI. However, I tried to find something about the
traditional Spanish variant in the ICU locale browser and found nothing. I
believe we should learn more about this problem before making any
decision.

Regards,
Alexei

2008/2/19, Tony Wu <wuyuehao@gmail.com>:
> Hi, all
>
> I'm investigating the regression[1] in the text module. These 5
> failures actually come down to one cause: support for the traditional
> Spanish character "ch". The following is my understanding.
>
> My fix for HARMONY-5465 makes Locale.toString() compatible with the
> RI. Before my commit, toString() of a Locale with an empty "country"
> field produced only one underscore in Harmony, whereas the RI produces
> two. For instance, new Locale("es","","TRADITIONAL").toString() returned
> "es_TRADITIONAL" in Harmony but returns "es__TRADITIONAL" in the RI.
> Interestingly, ICU uses the output of toString() as the key that
> identifies its own Locale instance. That is to say, the 5 testcases
> passed before only because they were never run against the real
> traditional Spanish locale, so the character "ch" was interpreted as two
> separate characters, "c" and "h". That is why we could set the offset to
> 1 in our testcases. After my commit, ICU finds the right Spanish locale,
> so its behavior is compatible with the spec[2].
>
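
To make the underscore difference concrete, here is a minimal sketch that
only prints what Locale.toString() returns on whatever JDK runs it. The
class and method names are my own (hypothetical), not part of Harmony or
the RI. It uses the three-argument Locale constructor from the example
above, which recent JDKs mark as deprecated but still provide.

```java
import java.util.Locale;

// Hypothetical demo class; the name is illustrative only.
class LocaleToStringSketch {

    // Builds a Locale from its three fields and returns its toString() form.
    static String render(String language, String country, String variant) {
        return new Locale(language, country, variant).toString();
    }

    public static void main(String[] args) {
        // Language and country only: one underscore between them.
        System.out.println(render("es", "ES", ""));          // es_ES
        // Empty country with a variant: the RI keeps the empty country slot,
        // so two underscores appear before the variant.
        System.out.println(render("es", "", "TRADITIONAL")); // es__TRADITIONAL
    }
}
```

Any code that uses this string as a lookup key will miss the entry if it
gets the one-underscore form, which matches the behavior described above.
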
> One strange thing is that I cannot get the traditional Spanish locale
> in the RI. The RI behaves the same no matter whether the variant
> "TRADITIONAL" is present or not. The spec does not say anything about
> "traditional", but from a web search I learned that since 1998 the
> character "ch" has been dropped as a separate letter in Spanish. I
> suppose that the RI changed the behavior of the Spanish locale but the
> spec was not updated accordingly.
>
> BTW, for the normal Spanish locale (new Locale("es","ES")), we have the
> same behavior as the RI. It seems ICU supports traditional Spanish in
> the form of new Locale("es","","TRADITIONAL") but the RI does not. Run
> the testcase below[3] on the RI to see the difference.
>
> Is there any expert familiar with Spanish here? I need your advice.
>
> [1]
> http://people.apache.org/~smishura/r628209/Windows_x86/classlib-test/
>
> [2]
> spec says,
> For example, consider the following in Spanish:
>
>  "ca" -> the first key is key('c') and second key is key('a').
>  "cha" -> the first key is key('ch') and second key is key('a').
>
>
> [3]
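>         // Needs java.text.Collator, java.text.RuleBasedCollator,
>         // java.text.CollationElementIterator and java.util.Locale,
>         // plus ICU4J on the classpath.
>
>         // Count collation elements for "cha" with the JDK (RI) collator.
>         RuleBasedCollator rbColl = (RuleBasedCollator) Collator
>                 .getInstance(new Locale("es", "", "TRADITIONAL"));
>         String text = "cha";
>         CollationElementIterator iterator = rbColl
>                 .getCollationElementIterator(text);
>         int keyNum = 0;
>         while (iterator.next() != CollationElementIterator.NULLORDER) {
>             keyNum++;
>         }
>         System.out.println("RI has " + keyNum + " keys");
>
>         // Same count with the ICU collator for the same locale.
>         com.ibm.icu.text.RuleBasedCollator r =
>                 (com.ibm.icu.text.RuleBasedCollator) com.ibm.icu.text.Collator
>                 .getInstance(new Locale("es", "", "TRADITIONAL"));
>         com.ibm.icu.text.CollationElementIterator it = r
>                 .getCollationElementIterator(text);
>         keyNum = 0;
>         while (it.next() != com.ibm.icu.text.CollationElementIterator.NULLORDER) {
>             keyNum++;
>         }
>         System.out.println("ICU has " + keyNum + " keys");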
>
>
>
> The output is:
> RI has 3 keys
> ICU has 2 keys
>
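
The 3-versus-2 difference can be illustrated without ICU. This is a
sketch under my own assumptions, not the actual Spanish rules of either
implementation. It builds two java.text.RuleBasedCollator instances from
small made-up rule strings, one that lists "ch" as a contraction and one
that does not, and counts the collation elements for the same text. The
class name, method name, and both rule strings are hypothetical.

```java
import java.text.CollationElementIterator;
import java.text.ParseException;
import java.text.RuleBasedCollator;

// Hypothetical demo class; names and rule strings are illustrative only.
class ContractionSketch {

    // Counts the collation elements that the given rules produce for text.
    static int countElements(String rules, String text) {
        try {
            RuleBasedCollator collator = new RuleBasedCollator(rules);
            CollationElementIterator it = collator.getCollationElementIterator(text);
            int count = 0;
            while (it.next() != CollationElementIterator.NULLORDER) {
                count++;
            }
            return count;
        } catch (ParseException e) {
            // The rule strings here are fixed, so a parse failure is a programming error.
            throw new IllegalArgumentException("bad collation rules: " + rules, e);
        }
    }

    public static void main(String[] args) {
        String plain = "< a < b < c < d < h";            // "c" and "h" are separate
        String traditional = "< a < b < c < ch < d < h"; // "ch" sorts as one unit after "c"
        System.out.println("without contraction: " + countElements(plain, "cha"));
        System.out.println("with contraction:    " + countElements(traditional, "cha"));
    }
}
```

With the contraction rule, "cha" should yield one element for "ch" and one
for "a". Without it, each of the three characters gets its own element.
That mirrors the spec example in [2].
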
>
> --
> Tony Wu
> China Software Development Lab, IBM
>
