Message-ID: <3b3f27c60807101950t151d9a18nb24809dff4ada502@mail.gmail.com>
Date: Thu, 10 Jul 2008 21:50:27 -0500
From: "Nathan Beyer"
Sender: nbeyer@gmail.com
To: dev@harmony.apache.org
Subject: Re: [classlib][niochar] charset providers
In-Reply-To: <4875D8AF.2060409@gmail.com>
References: <4875D8AF.2060409@gmail.com>

There are probably a few others that we need to include in the 'standard'
set because they are de facto standards. Off the top of my head, I'd say we
need to include Cp1252/Windows-1252, as that's the default charset on
Windows for Western European locales.

-Nathan

On Thu, Jul 10, 2008 at 4:38 AM, Tim Ellison wrote:

> I've been playing with nio_char to see if I can easily reduce the footprint
> of a Harmony install, and make the charset data more modular.
>
> At the moment, that module builds into nio_char.jar (1.32 MB) and
> hycharset.dll (1.51 MB), and uses the ICU charset code in
> icu4j-charsets-3_8.jar (2.36 MB).
>
> These are so big because a number of the charset encoders/decoders are
> data-driven rather than algorithmic.
>
> Q1: Why do we only specify the ICU providers in the services file [1]?
> Seems like we are not even using the Harmony providers at all.
>
> So I updated that file to prefer the Harmony providers.
>
> The providers use heuristics about when it makes sense to go into the
> native code to do the encode/decode. So I've simply added a flag to see if
> the natives are available, allowing me to remove hycharset.dll and lose
> no functionality when space is at a premium.
>
> Finally, the Harmony charsets are split across standard and additional
> providers, but they are all accessed from the same provider.
>
> Q2: What is the distinction used to classify some as standard and some as
> additional? The spec requires only a small subset of charsets to be
> supported as standard [2].
>
> For now I just split the packaging into Harmony's standard/additional
> charsets.
>
> So I've ended up with:
>     nio_char.jar            (155 KB)
>     nio_char_add.jar        (1.23 MB)  optional
>     hycharset.dll           (1.51 MB)  optional
>     icu4j-charsets-3_8.jar  (2.36 MB)  optional
>
> Like I say, I'd have to restructure Harmony's definition of standard
> charsets to comply with the spec, since we have far more in there than the
> spec requires.
>
> [1]
> http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/modules/nio_char/src/main/java/org/apache/harmony/niochar/java.nio.charset.spi.CharsetProvider?view=markup
> [2] http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/Charset.html
>
> Regards,
> Tim
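
For anyone who wants to check what a given packaging actually provides, here
is a rough sketch (not Harmony code) that probes the six charsets the Charset
documentation [2] lists as required, then the Windows-1252 names mentioned
above, and finally prints whichever CharsetProvider implementations are
registered through services files. The class name CharsetCheck and the probe
helper are made up for illustration, and java.util.ServiceLoader needs
Java 6 or later, so treat this as an assumption about the runtime rather
than something the thread itself relies on.

```java
import java.nio.charset.Charset;
import java.nio.charset.spi.CharsetProvider;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.ServiceLoader;

// Hypothetical helper class, not part of Harmony or the JDK.
class CharsetCheck {
    // The six charsets that the java.nio.charset.Charset documentation
    // says every Java platform implementation must support.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    // Maps each charset name to whether the running VM supports it.
    static Map<String, Boolean> probe(String... names) {
        Map<String, Boolean> result = new LinkedHashMap<String, Boolean>();
        for (String name : names) {
            result.put(name, Charset.isSupported(name));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("required:   " + probe(REQUIRED));
        // Not required by the spec; support depends on the installed providers.
        System.out.println("additional: " + probe("windows-1252", "Cp1252"));

        // Providers declared in META-INF/services files visible to this class loader.
        for (CharsetProvider provider : ServiceLoader.load(CharsetProvider.class)) {
            System.out.println("provider:   " + provider.getClass().getName());
        }
    }
}
```

Running this against an install with and without the optional jars should
show which names drop out of the "additional" line, while the "required"
line ought to stay all true on any compliant runtime.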