From: Rob Tompkins
To: Commons Developers List
Subject: Re: [text][lang] string escaping
Date: Thu, 24 Nov 2016 14:31:40 -0500
Message-Id: <0A7F4B21-DD71-4DF3-9D25-7D3BC563101A@gmail.com>
References: <78860ad1cf7900175e8e3a40028d26ec@scarlet.be>

> On Nov 24, 2016, at 1:58 PM, Gary Gregory wrote:
>
> I just found that longestCommonPrefix does not exist in [lang]... let's
> sort through what we have first before we imagine ourselves more headaches
> ;-)
>

Fair. I meant it more as an interesting example of where I think the
boundary between text and lang is, and I’m completely with you that it
sits squarely on the boundary.

With that in mind, what are the thoughts on the original question now
that we’ve explored where we think the boundary is? On one hand the
mechanics behind string escaping seem to lie completely in text, but on
the other the use case for it seems to lie more in lang.

Cheers,
-Rob

> Gary
>
> On Thu, Nov 24, 2016 at 10:55 AM, Gary Gregory wrote:
>
>> On Thu, Nov 24, 2016 at 5:54 AM, Rob Tompkins wrote:
>>
>>>> On Nov 22, 2016, at 4:38 PM, Gary Gregory wrote:
>>>>
>>>> On Tue, Nov 22, 2016 at 12:04 PM, Benedikt Ritter wrote:
>>>>
>>>>> Hello Gilles
>>>>>
>>>>> Gilles wrote on Tue, Nov 22, 2016 at 20:55:
>>>>>
>>>>>> On Tue, 22 Nov 2016 19:40:30 +0000, Benedikt Ritter wrote:
>>>>>>> Gary Gregory wrote on Sat, Nov 19, 2016 at 19:09:
>>>>>>>
>>>>>>>> On Nov 19, 2016 9:50 AM, "Gilles" wrote:
>>>>>>>>>
>>>>>>>>> On Sat, 19 Nov 2016 08:59:50 -0800, Gary Gregory wrote:
>>>>>>>>>>
>>>>>>>>>> On Sat, Nov 19, 2016 at 3:33 AM, Benedikt Ritter wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello Gary,
>>>>>>>>>>>
>>>>>>>>>>> Gary Gregory wrote on Sat, Nov 19, 2016 at 01:07:
>>>>>>>>>>>
>>>>>>>>>>>> Just a thought:
>>>>>>>>>>>>
>>>>>>>>>>>> Does all the current (and future) string escaping code (XML, HTML, ...)
>>>>>>>>>>>> really belong in [lang]? Would it be more natural to have it in [text]?
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> My view on the whole thing currently is that we put stuff that is
>>>>>>>>>>> related to strings in Lang. Code that works on texts should go to Text.
>>>>>>>>>>> To me a text is more than just a string. A text contains words that
>>>>>>>>>>> make up sentences, which in turn build paragraphs.
>>>>>>>>>>>
>>>>>>>>>>> Using this description, I'd argue that escaping belongs in lang and
>>>>>>>>>>> not in text, because it works on individual characters rather than on
>>>>>>>>>>> texts.
>>>>>>>>>>>
>>>>>>>>>>> But this would also raise the question of whether the various edit
>>>>>>>>>>> distance algorithms work on texts or on strings. So maybe my
>>>>>>>>>>> distinction is not good at all.
>>>>>>>>>>>
>>>>>>>>>>> Do we need to better specify the scope of text?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Great question of course.
>>>>>>>>>>
>>>>>>>>>> I'd like to think of [lang] as "What is missing from the JRE's most
>>>>>>>>>> basic classes and specifically from the java.lang package and some
>>>>>>>>>> java.util classes".
>>>>>>>>>>
>>>>>>>>>> Quoting from our site:
>>>>>>>>>>
>>>>>>>>>> "The standard Java libraries fail to provide enough methods for
>>>>>>>>>> manipulation of its core classes. Apache Commons Lang provides these
>>>>>>>>>> extra methods.
>>>>>>>>>> Lang provides a host of helper utilities for the java.lang API, notably
>>>>>>>>>> String manipulation methods, basic numerical methods, object reflection,
>>>>>>>>>> concurrency, creation and serialization and System properties.
>>>>>>>>>> Additionally it contains basic enhancements to java.util.Date
>>>>>>>>>
>>>>>>>>> How about "Date" becoming a nice standalone component? ;-)
>>>>>>>>> [Components should be concept-based.]
>>>>>>>>
>>>>>>>> Joda-time covers more than we will ever do here IMO. And Java 8 has
>>>>>>>> new time APIs... maybe when lang is Java 8 based we can look again.
>>>>>>>> For now I'd rather leave dates as is.
>>>>>>>>
>>>>>>>
>>>>>>> Yes, let's get back to topic.
>>>>>>>
>>>>>>> I think we need a clear distinction between string related stuff that
>>>>>>> goes into Lang and more complex stuff that goes into text.
>>>>>>
>>>>>> IMHO "more complex" is key, not so much "string" vs "text".
>>>>>>
>>>>>> Hence I suggest [text] is a better place for "RandomStringUtils"
>>>>>> than [lang], and the former should allow dependencies as a
>>>>>> foundation for that complexity; in that case, that would be
>>>>>> "Commons RNG".
>>>>>>
>>>>>
>>>>> I find it hard to draw a line here. What might be complex to me could
>>>>> be simple for others. I fear that there will always be discussions.
>>>>>
>>>>
>>>> Then we can look at a feature request through a different lens: Would
>>>> you reasonably expect this to be in java.lang or java.util?
>>>
>>> Let’s consider an example that I would ask about here. Consider the
>>> “longestCommonPrefix” method:
>>>
>>> public static int longestCommonPrefix(final String s1, final String s2) {
>>>     int i = 0;
>>>     while (i < s1.length() && i < s2.length()
>>>             && s1.charAt(i) == s2.charAt(i)) {
>>>         i++;
>>>     }
>>>     return i;
>>> }
>>>
>>> I would think that this would end up in text because I doubt many folks
>>> would use this in standard application code. What are others’ thoughts?
>>>
>>
>> This is 50/50 in my mind because there is no "domain" like words,
>> sentences, and so on. This is really about churning through a string for
>> some condition.
>>
>> That said, moving it to [text] could be an opportunity to change the API
>> name. Why is it longestCommonPrefix and not just commonPrefix? "What would
>> a shortestCommonPrefix be?" is what I think when I see "longest".
>>
>> Gary
>>
>>>
>>> -Rob
>>>
>>>>
>>>> Gary
>>>>
>>>>>>
>>>>>> Regards,
>>>>>> Gilles
>>>>>>
>>>>>>>>
>>>>>>>> Gary
>>>>>>>>>
>>>>>>>>> How about deprecating "RandomUtils"?
>>>>>>>>> [(About to be) superseded by "Commons RNG".]
>>>>>>>>>
>>>>>>>>> How about
>>>>>>>>> * moving "RandomStringUtils" to [text] too and
>>>>>>>>> * implementing it against a custom interface (as per Jochen's
>>>>>>>>>   remark) rather than "java.util.Random"
>>>>>>>>> ?
>>>>>>>>>
>>>>>>>>>> and a series of utilities
>>>>>>>>>> dedicated to help with building methods, such as hashCode, toString
>>>>>>>>>> and equals."
>>>>>>>>>>
>>>>>>>>>> I do not think edit distances fit into this at all.
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>>
>>>>>>>>>> I am also questioning whether string escaping belongs in lang as
>>>>>>>>>> well since there are so many escaping domains: XML, HTML, JSON, and
>>>>>>>>>> so on.
>>>>>>>>>
>>>>>>>>> They don't belong.
>>>>>>>>>
>>>>>>>>>> IMO, anything that is word based, like capitalization, does not
>>>>>>>>>> belong in lang. The WordUtils class should be in [text] IMO. The
>>>>>>>>>> whole lang text package should be in [text] IMO.
>>>>>>>>>
>>>>>>>>> +1
>>>>>>>>> [To anything that imposes a strict diet on the humongous
>>>>>>>>> "components".]
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Gilles
>>>>>>>>>
>>>>
>>>> --
>>>> E-Mail: garydgregory@gmail.com | ggregory@apache.org
>>>> Java Persistence with Hibernate, Second Edition
>>>> JUnit in Action, Second Edition
>>>> Spring Batch in Action
>>>> Blog: http://garygregory.wordpress.com
>>>> Home: http://garygregory.com/
>>>> Tweet! http://twitter.com/GaryGregory

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org
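
To make the two examples from the thread concrete, here is a minimal sketch, not a proposal for either component. The class name BoundarySketch and the method names commonPrefixLength, commonPrefix and escapeXml are hypothetical and are not existing [lang] or [text] API. The prefix loop is the one Rob posted; commonPrefix shows what the shorter name Gary asked about might return; escapeXml handles only the five predefined XML entities and illustrates Benedikt's point that escaping walks a string one character at a time.

```java
// Hypothetical sketch: names are illustrative only, not Commons Lang/Text API.
final class BoundarySketch {

    private BoundarySketch() {
    }

    /** Number of leading UTF-16 code units shared by s1 and s2 (the loop from the thread). */
    static int commonPrefixLength(final String s1, final String s2) {
        int i = 0;
        while (i < s1.length() && i < s2.length() && s1.charAt(i) == s2.charAt(i)) {
            i++;
        }
        return i;
    }

    /** The shared prefix itself, built on the length computed above. */
    static String commonPrefix(final String s1, final String s2) {
        return s1.substring(0, commonPrefixLength(s1, s2));
    }

    /** Escapes only the five predefined XML entities, one char at a time. */
    static String escapeXml(final String input) {
        final StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            final char c = input.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");
                    break;
                case '<':
                    out.append("&lt;");
                    break;
                case '>':
                    out.append("&gt;");
                    break;
                case '"':
                    out.append("&quot;");
                    break;
                case '\'':
                    out.append("&apos;");
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(final String[] args) {
        System.out.println(commonPrefixLength("commons-lang", "commons-text")); // prints 8
        System.out.println(commonPrefix("commons-lang", "commons-text"));       // prints commons-
        System.out.println(escapeXml("a < b & c"));                             // prints a &lt; b &amp; c
    }
}
```

Neither method knows anything about words or sentences, which is why both sit near the boundary being discussed. A per-domain escaper (HTML, JSON, ...) would need its own table, which is the "many escaping domains" concern raised earlier in the thread.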