Date: Thu, 10 Jun 2004 23:45:24 -0400
From: Dmitri Tikhonov
To: dev@apr.apache.org
Subject: Case-sensitive months in apr_date_parse_rfc
Message-ID: <20040611034524.GA26611@netilla.com>

Today I saw this date string out in the wild:

  'saturday, 11-jun-2004 16:10:27 GMT'

While it looks very much like a date format for which I submitted a patch over a year ago[1], its month starts with a lower-case letter, so the string cannot be parsed using apr_date_parse_rfc. I am sure that somewhere in the myriad of date string specifications there are rules against lower-case month names, but the function in question wants to be lenient in what it parses.

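To make the failure concrete, here is a minimal sketch of a mask check. It is not the apr_date_checkmask source: the name checkmask_sketch, the subset of mask symbols it handles, and the mask string used in the note below are assumptions made for illustration only.

```c
#include <ctype.h>

/* Simplified stand-in for a mask matcher (hypothetical, not the APR code).
 * Symbols handled here: '@' uppercase letter, '$' lowercase letter,
 * '#' digit, '*' swallow the rest of the input.
 * Any other mask character must match the input exactly. */
int checkmask_sketch(const char *data, const char *mask)
{
    for (; *mask != '\0'; mask++, data++) {
        unsigned char d = (unsigned char)*data;

        switch (*mask) {
        case '*':
            return 1;                    /* remaining input is ignored */
        case '@':
            if (!isupper(d)) return 0;
            break;
        case '$':
            if (!islower(d)) return 0;
            break;
        case '#':
            if (!isdigit(d)) return 0;
            break;
        default:
            if (*mask != *data) return 0; /* exact match required */
            break;
        }
    }
    return *data == '\0';                /* mask used up: input must end too */
}
```

With an assumed mask of "##-@$$-#### ##:##:## *", the sketch accepts "11-Jun-2004 16:10:27 GMT" but rejects "11-jun-2004 16:10:27 GMT", because the 'j' fails the '@' (uppercase) test.
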
I wanted to submit a patch that makes apr_date_parse_rfc ignore the case of month names. The plan was to add support for a new mask symbol that matches a letter of any capitalization, change the masks correspondingly, and convert monstr to its usual @$$ form before checking the months array. Then I noticed that apr_date_checkmask is not a static function, so others may already use it in a way that would not allow the introduction of a new symbol for isalpha[2].

The question therefore becomes: what is the best way to make apr_date_parse_rfc support case-insensitive months without sacrificing much performance and without breaking the API by introducing a new special symbol?

One way is to make up to eight calls to apr_date_checkmask for each format instead of one: "@@@" || "@@$" || "@$@"... instead of the current "@$$", but that is just yucky[3].

Given the constraints, I am stuck. Suggestions?

  - Dmitri.

[1] http://cvs.apache.org/viewcvs.cgi/apr-util/misc/apr_date.c?r1=1.15&r2=1.16

[2] From the apr_date_checkmask description:
      <x> - exact match for any other character

[3] This could probably be optimized into a loop whose first iteration would catch the 99% of cases where the month string fits the @$$ format and then, if no match is found, mangles "@$$" seven more times until something (or nothing) is found.
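The normalization step mentioned above (converting the month string to its usual @$$ form before the months lookup) could look roughly like the sketch below. The function name month_index_any_case and the string-based months table are assumptions for illustration; the real apr_date.c may store and compare months differently.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 0..11 for a three-letter English month
 * abbreviation in any capitalization, or -1 if it is not a month.
 * Only the first three characters of monstr are examined. */
int month_index_any_case(const char *monstr)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    char norm[3];
    int i;

    /* Reject anything that is not three letters (also stops at a NUL). */
    for (i = 0; i < 3; i++) {
        if (!isalpha((unsigned char)monstr[i]))
            return -1;
    }

    /* Convert to the usual @$$ form: upper, lower, lower. */
    norm[0] = (char)toupper((unsigned char)monstr[0]);
    norm[1] = (char)tolower((unsigned char)monstr[1]);
    norm[2] = (char)tolower((unsigned char)monstr[2]);

    for (i = 0; i < 12; i++) {
        if (memcmp(norm, months[i], 3) == 0)
            return i;
    }
    return -1;
}
```

This only covers the lookup side. The masks themselves would still need some way to accept a letter of either case at the month position, which is the open question in this message.
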