Date: Sat, 9 Oct 2010 14:13:31 +0100
Subject: Re: [lang] Wildcard regex
From: sebb
To: Commons Developers List

On 9 October 2010 12:33, Stephen Colebourne wrote:
> No, the wildcard is used for a database search. Find me all names
> matching "Foo*". This is not for [io].

But where do humans get the idea that * and ? are wildcards?

>
> The oro code does look reasonable I'd say.

Agreed, it could be adapted for Commons.
However, whether the syntax it supports is universal I don't know.

I think it's a useful addition, but the syntax needs to be carefully documented.

And it would be useful if there were versions for different OSes, to
allow filename matching using the standard for that OS.

> Stephen
>
>
> On 9 October 2010 12:25, sebb wrote:
>> What does the regex represent? A filename?
>>
>> If so, then maybe the code belongs in IO rather than Lang.
>>
>> Also, filename globbing is not consistent across OSes.
>>
>> On 8 October 2010 15:32, Stephen Colebourne wrote:
>>> Human users enter wildcards * and ? (because regex is too complex). In
>>> my case, I'm passing it to MongoDB, which needs regex.
>>>
>>> Stephen
>>>
>>>
>>> On 8 October 2010 15:10, Paul Benedict wrote:
>>>> Can I get some sense of use case? What would you use it for? Just curious.
>>>>
>>>> On Fri, Oct 8, 2010 at 9:06 AM, Stephen Colebourne wrote:
>>>>> I don't think commons lang has a routine for converting a standard
>>>>> wildcard string (with * and ?) to a regex.
>>>>> Here is a first suggestion, although I'm sure it can be improved.
>>>>>
>>>>>  public Pattern createPattern(String text) {
>>>>>    StringTokenizer tkn = new StringTokenizer(text, "?*", true);
>>>>>    StringBuilder buf = new StringBuilder(text.length() + 10);
>>>>>    buf.append('^');
>>>>>    boolean lastStar = false;
>>>>>    while (tkn.hasMoreTokens()) {
>>>>>      String str = tkn.nextToken();
>>>>>      if (str.equals("?")) {
>>>>>        buf.append('.');
>>>>>        lastStar = false;
>>>>>      } else if (str.equals("*")) {
>>>>>        if (lastStar == false) {
>>>>>          buf.append(".*");
>>>>>        }
>>>>>        lastStar = true;
>>>>>      } else {
>>>>>        buf.append(Pattern.quote(str));
>>>>>        lastStar = false;
>>>>>      }
>>>>>    }
>>>>>    buf.append('$');
>>>>>    return Pattern.compile(buf.toString(), Pattern.CASE_INSENSITIVE);
>>>>>  }
>>>>>
>>>>> Other possible conversions would be * and ? to database wildcards, so
>>>>> perhaps there is scope for a few related methods here?
>>>>>
>>>>> Stephen
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>>>>> For additional commands, e-mail: dev-help@commons.apache.org
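
The thread mentions converting * and ? to database wildcards as a possible
related method but gives no code for it. The following is only a minimal
sketch of what such a conversion might look like for SQL LIKE, where % matches
any run of characters and _ matches a single character. The class and method
names (WildcardToLike, toSqlLike) are hypothetical and are not part of Commons
Lang, and the sketch assumes the query declares backslash as its escape
character with an ESCAPE clause.

```java
final class WildcardToLike {

    private WildcardToLike() {
    }

    // Hypothetical helper, not an existing Commons Lang method.
    // Assumption: the SQL statement declares backslash as the escape
    // character, for example: WHERE name LIKE ? ESCAPE '\'
    static String toSqlLike(String text) {
        StringBuilder buf = new StringBuilder(text.length() + 10);
        boolean lastStar = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '*') {
                // Collapse a run of '*' into a single '%', mirroring the
                // lastStar handling in the regex suggestion.
                if (!lastStar) {
                    buf.append('%');
                }
                lastStar = true;
                continue;
            }
            lastStar = false;
            if (c == '?') {
                // '?' matches exactly one character, which is '_' in LIKE.
                buf.append('_');
            } else if (c == '%' || c == '_' || c == '\\') {
                // Escape characters that are special to LIKE so that they
                // match literally.
                buf.append('\\').append(c);
            } else {
                buf.append(c);
            }
        }
        return buf.toString();
    }
}
```

With this sketch, the "Foo*" search from the thread would become the LIKE
pattern Foo%. The result should be passed as a bound parameter rather than
concatenated into the SQL text, and the escape character may need to change
for databases that treat backslash differently.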