From: "David Wortham"
To: modules-dev@httpd.apache.org
Reply-To: modules-dev@httpd.apache.org
Date: Sat, 5 May 2007 11:57:57 -0600
Subject: Re: help with ap_escape_uri()
Message-ID: <5280fae50705051057md1cd5c5t423c3536dc6c18e5@mail.gmail.com>
In-Reply-To: <20070505183020.19e15fba@Alucard.r3z0>
References: <20070505183020.19e15fba@Alucard.r3z0>
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_68903_17509001.1178387877600"

------=_Part_68903_17509001.1178387877600
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

Thibaut,

As far as I know, URI escaping functions escape every non-alphanumeric
character that is not in the following set:
{'-', '_', ':', '/', '?', '=', '&', '#', '.'} (there may be others I
can't think of right now). If a character is in that set, the URI
remains "legal" even when the character is left unescaped.

A reason for this: if you start with a link
(http://www.nowhere.com/some_dir?where_you_going=nowhere#top), a number
of special characters are required to parse the URI correctly.
Without {'/', ':'} there can be no "http://".
Without {'?'} there is no query string, only a run-on directory path.
Without {'#'} there is no anchor, only an incorrectly long GET
parameter value.

This is not a bug; you need to manually escape any of the special
characters (probably called URI META characters or something like that)
if you expect them to be URL-encoded.
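
To make the distinction between delimiters and data concrete, here is a
minimal sketch of encoding a single URI component (for example one file
name) so that a '&' inside it becomes "%26". It is only an illustration
under stated assumptions: encode_component is a hypothetical helper, not
an Apache API, and it uses plain malloc rather than pools.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of Apache): percent-encode every byte of
 * ONE URI component except ASCII letters, digits, '-', '_', '.' and '~'.
 * Delimiters such as '/', '?' and '&' are therefore encoded too, so the
 * caller adds the real delimiters afterwards, unencoded.
 * Returns a malloc'ed string the caller must free, or NULL on failure. */
char *encode_component(const char *in)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len;
    char *out, *p;

    assert(in != NULL);
    len = strlen(in);
    out = malloc(3 * len + 1);      /* worst case: every byte becomes %XX */
    if (out == NULL)
        return NULL;

    p = out;
    for (; *in != '\0'; in++) {
        /* Work on unsigned bytes so values >= 0x80 encode correctly. */
        unsigned char c = (unsigned char)*in;

        if ((c < 0x80 && isalnum(c)) ||
            c == '-' || c == '_' || c == '.' || c == '~') {
            *p++ = (char)c;         /* safe character: copy as-is */
        } else {
            *p++ = '%';
            *p++ = hex[c >> 4];     /* high nibble */
            *p++ = hex[c & 0x0F];   /* low nibble */
        }
    }
    *p = '\0';
    return out;
}
```

For instance, encode_component("Simon & Garfunkel.mp3") gives
"Simon%20%26%20Garfunkel.mp3", which can then be joined to the rest of
the URI with literal '/' and '?' characters. In a module the malloc call
would presumably be replaced by apr_palloc() on the request pool. It is
also worth checking whether a later escaping pass re-escapes the '%'
signs, since that would encode the string a second time.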

If all '&' characters were URI-escaped all of the time, there would be
no way to create a GET parameter list; there would never be more than
one parameter.

As for a workaround, you will need to find a pool-friendly (assuming you
are using pools for memory allocation in this specific instance)
character/substring replacement function. You will likely want to encode
each component of the URI separately with this function and then use
ap_escape_uri().

I am not familiar with a particular function that will do the trick, but
I use a pool-modified version of a Yahoo! C-library function for
URL-encoding. You can probably get this function to URL-encode all
characters (or just the '&' character) with minimal effort. Just modify
the "isurlchar(...)" function to suit your needs.

BTW - this function should be converted to use pools when allocating
C-string memory. The following code is from yahoo_httplib.c (GNU General
Public License). I found it through Google.com/codesearch

/* -------------------------------------------------- */
#include <ctype.h>   /* isalnum */
#include <stdio.h>   /* snprintf */
#include <string.h>  /* strlen */

/* y_new / y_renew are the allocation macros used by the Yahoo! library. */

/* Characters that are copied through without encoding. */
static int isurlchar(unsigned char c)
{
    return (isalnum(c) || '-' == c || '_' == c);
}

char *yahoo_urlencode(const char *instr)
{
    int ipos = 0, bpos = 0;
    char *str = NULL;
    int len = strlen(instr);

    /* Worst case: every input byte becomes "%xx". */
    if (!(str = y_new(char, 3*len + 1)))
        return "";   /* note: a string literal; the caller must not free it */

    while (instr[ipos]) {
        /* Copy a run of safe characters unchanged. */
        while (isurlchar(instr[ipos]))
            str[bpos++] = instr[ipos++];
        if (!instr[ipos])
            break;

        /* Cast to unsigned char so bytes >= 0x80 are not sign-extended. */
        snprintf(&str[bpos], 4, "%%%.2x", (unsigned char)instr[ipos]);
        bpos += 3;
        ipos++;
    }
    str[bpos] = '\0';

    /* free extra alloc'ed mem. */
    len = strlen(str);
    str = y_renew(char, str, len+1);

    return (str);
}
/* -------------------------------------------------- */

Dave

On 5/5/07, Thibaut VARENE wrote:
>
> Hi,
>
> I'm writing mod_musicindex[0], and I have a problem I can't fix: "&" in
> filenames aren't escaped into "%26" with ap_escape_uri(), see [1].
>
> I've been digging apache source in search of a solution with little
> luck, and I was wondering if somebody could tell me 1) why
> ap_escape_uri() (or ap_os_escape_path() for that matter) doesn't escape
> '&', and what I'm supposed to do to work around that.
>
> This bug happens with apache 1.3.33 and apache 2.2.3, fwiw.
>
> TIA
>
> Thibaut
>
> PS: Please CC-me in replies
>
> [0] http://www.parisc-linux.org/~varenet/musicindex/
> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=421820
>
> --
> Thibaut VARENE
> http://www.parisc-linux.org/~varenet/
>

--
David Wortham
Senior Web Applications Developer
Unspam Technologies, Inc.
(408) 338-8863

------=_Part_68903_17509001.1178387877600--