From: Scott Wilson
To: user@commons.apache.org
Subject: collapsing unicode white space
Date: Thu, 29 Oct 2009 16:21:04 +0000

Hi everyone,

I need to implement a W3C processing algorithm which states:

10.1.8 Rule for Getting Text Content with Normalized White Space

The rule for getting text content with normalized white space is given
in the following algorithm. The algorithm always returns a string,
which MAY be empty.

  • Let input be the Element to be processed.
  • Let result be the result of applying the rule for getting text
    content to input.
  • In result, convert any sequence of one or more Unicode white space
    characters into a single U+0020 SPACE.
  • Return result.

The step I'm having problems with is "convert any sequence of one or
more Unicode white space characters into a single U+0020 SPACE."

The StringUtils replace() and CharSetUtils squeeze() methods would
seem to be best suited for solving this one, but, for one thing, there
doesn't seem to be a set syntax defined for easily specifying Unicode
white space characters.
Has anyone else solved a similar problem using commons lang, or should
I consider using something else?

Thanks!

S

/-/-/-/-/-/
Scott Wilson
Apache Wookie: http://incubator.apache.org/projects/wookie.html
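
A note on the step quoted in the message ("convert any sequence of one or more Unicode white space characters into a single U+0020 SPACE"): one way to sketch it is with plain java.util.regex rather than Commons Lang. This is only an illustration, not a reply from the list. The class name WhiteSpaceNormalizer and the method normalize() are made up for the example, returning an empty string for null input is an assumption, and the character class is also an assumption: it lists the code points believed to carry the Unicode White_Space property, and should be checked against the Unicode version the specification refers to. Character.isWhitespace() is not a drop-in test for this, since it returns false for the non-breaking spaces such as U+00A0.

```java
import java.util.regex.Pattern;

public class WhiteSpaceNormalizer {

    // Assumed list of code points with the Unicode White_Space property:
    // U+0009..U+000D, U+0020, U+0085, U+00A0, U+1680, U+2000..U+200A,
    // U+2028, U+2029, U+202F, U+205F, U+3000.
    // Verify against the Unicode version you target before relying on it.
    private static final Pattern UNICODE_WHITE_SPACE = Pattern.compile(
        "[\\u0009-\\u000D\\u0020\\u0085\\u00A0\\u1680\\u2000-\\u200A"
        + "\\u2028\\u2029\\u202F\\u205F\\u3000]+");

    /**
     * Replaces every run of one or more white space characters with a
     * single U+0020 SPACE. Leading and trailing runs are collapsed too,
     * not removed, because the quoted step does not ask for trimming.
     */
    public static String normalize(String text) {
        if (text == null) {
            return ""; // assumption: treat missing text content as empty
        }
        return UNICODE_WHITE_SPACE.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Tab, newline, space, NO-BREAK SPACE and EM SPACE between letters.
        String sample = "a\t\n b\u00A0\u2003c";
        System.out.println(normalize(sample)); // prints "a b c"
    }
}
```

Compiling the pattern once and reusing it avoids re-parsing the expression on every call, which matters if the rule is applied to many elements.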