From: Martin Sebor
Organization: Rogue Wave Software, Inc.
Date: Thu, 27 Mar 2008 18:10:34 -0600
To: dev@stdcxx.apache.org
Subject: Re: [Stdcxx Wiki] Update of "LocaleLookup" by TravisVitek
Message-ID: <47EC377A.3060308@roguewave.com>

Travis Vitek wrote:
>
>
> Travis Vitek wrote:
>> Martin Sebor wrote:
>>
>>> Travis Vitek wrote:
>>>
>>>> + || 22.LOCALE.CONS.MT.CPP || *1,+ ||
>>>> + || 22.LOCALE.CTYPE.CPP || *2 ||
>>>> + || 22.LOCALE.CTYPE.IS.CPP || *2 ||
>>>> + || 22.LOCALE.CTYPE.MT.CPP || *1,+ ||
>>>> + || 22.LOCALE.CTYPE.NARROW.CPP || *2 ||
>>>> + || 22.LOCALE.CTYPE.SCAN.CPP || *2 ||
>>>> + || 22.LOCALE.CTYPE.TOLOWER.CPP || *2 ||
>>>> + || 22.LOCALE.CTYPE.TOUPPER.CPP || *2 ||
>>> I thought the ctype tests were being run in all installed locales,
>>> just like the numpunct one? Which is what we want to move away from.
>>> IMO, exercising a small set (less than a dozen) of known locales and
>>> encodings should be plenty.
>>>
>> Yes, the non-mt ctype tests iterate over each locale for which
>> the function call `setlocale (LC_CTYPE, name)' succeeds. The mt
>> ctype tests all limit the number of tested locales to 32.
>>
>
> Any suggestions on which languages/countries/codesets that we should
> be testing against for the ctype tests?

I think we should cover a few Western locales and a few Asian ones.
For the first group, here are some candidates: one of each of en_US,
de_*, fr_*, es_*, in a mix of ISO-8859 and UTF-8. For the second
group, I'd consider one of each of ja_JP, ru_*, zh_* in EUC-JP,
Shift_JIS, KOI*, GB*, and UTF-8.

>
> Reducing the number of selected locales to 32 is pretty easy. Selecting
> which locales is a little more difficult.

You're telling me! :)

> Another issue is that the
> mechanism I have defined doesn't support selecting only one locale for
> each match.

So there's no way to ask for just one locale out of the three
here: ja_JP.{EUC-JP,Shift_JIS,UTF-8}

That may not be too much of a problem unless each of the expansions
matches multiple aliases of the same locale. We'll see how it goes
as we come up with query strings for each test.

Martin
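
P.S. To make the "just one locale per match" idea concrete, here is a rough sketch of what such a selection could look like. It is not the mechanism Travis has defined and not part of the stdcxx test driver: expand_braces() and first_installed() are hypothetical names made up for this illustration, and the expansion handles only a single, non-nested brace group.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of stdcxx): expands a single, non-nested
// brace group, e.g. "ja_JP.{EUC-JP,Shift_JIS,UTF-8}" yields three names.
// A pattern without a complete brace group expands to itself.
std::vector<std::string> expand_braces (const std::string &pattern)
{
    std::vector<std::string> names;

    const std::string::size_type open  = pattern.find ('{');
    const std::string::size_type close =
        open == std::string::npos ? open : pattern.find ('}', open);

    if (close == std::string::npos) {
        names.push_back (pattern);
        return names;
    }

    const std::string prefix = pattern.substr (0, open);
    const std::string suffix = pattern.substr (close + 1);

    std::string::size_type pos = open + 1;

    for ( ; ; ) {
        assert (pos <= close);

        // find the end of the current alternative within the braces
        std::string::size_type comma = pattern.find (',', pos);
        if (comma == std::string::npos || comma > close)
            comma = close;

        names.push_back (prefix + pattern.substr (pos, comma - pos) + suffix);

        if (comma == close)
            break;

        pos = comma + 1;
    }

    return names;
}

// Hypothetical helper: returns the first expansion that setlocale()
// accepts for LC_CTYPE, or an empty string if none is installed.
// The LC_CTYPE setting in effect on entry is restored before returning.
std::string first_installed (const std::string &pattern)
{
    const char* const cur = std::setlocale (LC_CTYPE, 0);
    const std::string saved (cur ? cur : "C");

    std::string result;

    const std::vector<std::string> names = expand_braces (pattern);

    for (std::size_t i = 0; i != names.size (); ++i) {
        if (std::setlocale (LC_CTYPE, names [i].c_str ())) {
            result = names [i];
            break;
        }
    }

    std::setlocale (LC_CTYPE, saved.c_str ());

    return result;
}
```

Calling first_installed ("ja_JP.{EUC-JP,Shift_JIS,UTF-8}") would yield whichever of the three names the system accepts first, in the order written. Note that this does nothing about the alias problem mentioned above: two differently spelled names that refer to the same installed locale would still be treated as distinct.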