httpd-modules-dev mailing list archives

From "David Wortham" <djwort...@gmail.com>
Subject Re: help with ap_escape_uri()
Date Sat, 05 May 2007 17:57:57 GMT
Thibaut,
   As far as I know, URI escaping functions escape every non-alphanumeric
character that is not in the following set: {'-', '_', ':', '/', '?',
'=', '&', '#', '.'} (there may be others I can't think of right now).  If a
character is in that set of characters, the URI remains "legal" even if the
character is unescaped.

   A reason for this:
If you start with a link (
http://www.nowhere.com/some_dir?where_you_going=nowhere#top), there are a
number of special characters that are required to parse the URI correctly.
Without the characters {'/', ':'}, there can be no "http://".
Without the character {'?'}, there is no query string, only a run-on
directory path.
Without the character {'#'}, there is no anchor, only an incorrectly
long GET parameter value.

   This is not a bug; you need to manually escape any of the special
characters (RFC 3986 calls them "reserved" characters) if
you expect them to be URL-encoded.  If all '&' characters were URI-escaped
all of the time, there would be no way to create a GET parameter list; there
would never be more than one parameter.
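
To make the '&' point concrete, here is a minimal sketch (not Apache code;
count_query_params() is a hypothetical helper invented for this example)
that counts parameters the way a query string is normally split.  It only
assumes that '&' is the parameter delimiter, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Count the parameters in a query string by splitting on '&',
 * the delimiter used in GET parameter lists.
 * Hypothetical helper for illustration only. */
static size_t count_query_params(const char *query)
{
    size_t count;

    assert(query != NULL);
    if (*query == '\0')
        return 0;

    count = 1;
    for (; *query != '\0'; query++) {
        if (*query == '&')
            count++;
    }
    return count;
}
```

With this helper, "file=AC&DC.mp3" counts as two parameters, while
"file=AC%26DC.mp3" counts as one.  That is why a literal '&' inside a
value has to be encoded by hand before the URI is assembled.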

   As for a workaround, you will need to find a pool-friendly (assuming you
are using pools for memory allocation in this specific instance)
character/substring replacement function.  You will likely want to do a
straight encode of each component of the URI separately with this function,
then use ap_escape_uri().  I am not familiar with a particular function
that will do the trick, but I use a pool-modified version of a Yahoo!
C-library function for URL-encoding.

You can probably get this function to URL-encode all characters (or just the
'&' character) with minimal effort.  Just modify the "isurlchar(...)"
function to suit your needs.  BTW - this function should be converted to use
pools when allocating C-string memory.

The following code is from yahoo_httplib.c (GNU General Public License).  I
found it through Google.com/codesearch.
/* -------------------------------------------------- */

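#include <ctype.h>   /* isalnum */
#include <stdio.h>   /* snprintf */
#include <string.h>  /* strlen */

/* y_new() and y_renew() are allocation macros from the same library. */

/* Characters that are copied through unescaped. */
static int isurlchar(unsigned char c)
{
	return (isalnum(c) || '-' == c || '_' == c);
}

char *yahoo_urlencode(const char *instr)
{
	int ipos=0, bpos=0;
	char *str = NULL;
	int len = strlen(instr);

	/* worst case: every input byte becomes "%xx" */
	if(!(str = y_new(char, 3*len + 1) ))
		return "";

	while(instr[ipos]) {
		while(isurlchar(instr[ipos]))
			str[bpos++] = instr[ipos++];
		if(!instr[ipos])
			break;

		/* cast so that bytes >= 0x80 are not sign-extended */
		snprintf(&str[bpos], 4, "%%%.2x", (unsigned char)instr[ipos]);
		bpos+=3;
		ipos++;
	}
	str[bpos]='\0';

	/* free extra alloc'ed mem. */
	len = strlen(str);
	str = y_renew(char, str, len+1);

	return (str);
}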

/* -------------------------------------------------- */
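
For comparison, here is a self-contained sketch of the same idea that does
not depend on the Yahoo! macros.  It is not the version I actually use:
encode_component() and is_unreserved() are names made up for this example,
the unescaped set (alphanumerics plus '-', '_', '.', '~') is an assumption
based on RFC 3986's unreserved characters, and it uses plain malloc() so it
can be compiled outside of Apache.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Assumed "safe" set: RFC 3986 unreserved characters. */
static int is_unreserved(unsigned char c)
{
    return isalnum(c) || c == '-' || c == '_' || c == '.' || c == '~';
}

/* Percent-encode one URI component (a path segment or a query value).
 * Returns a malloc()ed string the caller must free(), or NULL if the
 * allocation fails. */
char *encode_component(const char *in)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len, i, o = 0;
    char *out;

    assert(in != NULL);
    len = strlen(in);
    out = malloc(3 * len + 1);  /* worst case: every byte becomes %XX */
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        /* work on unsigned bytes so values >= 0x80 encode correctly */
        unsigned char c = (unsigned char)in[i];
        if (is_unreserved(c)) {
            out[o++] = (char)c;
        } else {
            out[o++] = '%';
            out[o++] = hex[c >> 4];
            out[o++] = hex[c & 0x0F];
        }
    }
    out[o] = '\0';
    return out;
}
```

Calling encode_component("AC&DC.mp3") gives "AC%26DC.mp3".  Inside a
module, the malloc() call could presumably be swapped for
apr_palloc(r->pool, 3 * len + 1), in which case the caller no longer
frees the result.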

Dave

On 5/5/07, Thibaut VARENE <T-Bone@parisc-linux.org> wrote:
>
> Hi,
>
> I'm writing mod_musicindex[0], and I have a problem I can't fix: "&" in
> filenames aren't escaped into "%26" with ap_escape_uri(), see [1].
>
> I've been digging apache source in search of a solution with little
> luck, and I was wondering if somebody could tell me 1) why
> ap_escape_uri() (or ap_os_escape_path() for that matter) doesn't escape
> '&', and what I'm supposed to do to work around that.
>
> This bug happens with apache 1.3.33 and apache 2.2.3, fwiw.
>
> TIA
>
> Thibaut
>
> PS: Please CC-me in replies
>
> [0] http://www.parisc-linux.org/~varenet/musicindex/
> [1] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=421820
>
> --
> Thibaut VARENE
> http://www.parisc-linux.org/~varenet/
>



-- 
David Wortham
Senior Web Applications Developer
Unspam Technologies, Inc.
(408) 338-8863
