httpd-apreq-dev mailing list archives

From Stas Bekman <s...@stason.org>
Subject Re: the effectiveness of split_to_parms
Date Fri, 19 Apr 2002 15:44:01 GMT
Joe Schaefer wrote:
> Stas Bekman <stas@stason.org> writes:
> 
> 
>>in request.c
>>
>>Is there any reason why we call the unescape and req_plustospace
>>functions in a loop, instead of doing this once on *data? I think
>>calling a function once on a long string is faster than calling the
>>same function 20 times (if we have 10 fields) on shorter strings.
>>
>>Also what's the use of calling req_plustospace on the key string?
> 
> 
> Because the key string might contain plus signs!

Sorry, I don't follow. Let's say that the key string includes plus 
signs: what does it matter when you do s/+/ /g? You split on &|; and 
then on =, so doing s/+/ /g before or after the split gives the same 
result. Unless I'm missing something.
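
To make the point above concrete, here is a minimal sketch in plain C (no APR). The helper names plus_to_space, is_delim and count_delims are made up for this illustration; plus_to_space is only a stand-in for req_plustospace, whose real implementation is not shown in this thread. It relies on the observation that neither '+' nor ' ' is one of the characters the parser splits on, so the translation cannot move or create a split point.

```c
#include <assert.h>
#include <string.h>   /* size_t */

/* Replace every '+' with ' ' in place (hypothetical stand-in for
 * req_plustospace). */
static void plus_to_space(char *s)
{
    assert(s != NULL);
    for (; *s; ++s)
        if (*s == '+')
            *s = ' ';
}

/* Return 1 if c is one of the characters the parser splits on. */
static int is_delim(char c)
{
    return c == '&' || c == ';' || c == '=';
}

/* Count the delimiter characters in s. */
static size_t count_delims(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s; ++s)
        if (is_delim(*s))
            ++n;
    return n;
}
```

Calling plus_to_space on a whole query string such as "first+name=Joe+S&city=New+York" leaves count_delims unchanged, so the later split sees the same field boundaries either way.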

Also, is it part of the RFC that one should do s/+/ /g in the keys? I 
think + is used for denoting a list of arguments for the same key.
I can't seem to find the spec for + in the latest URL RFC. Which one is 
the latest? 1738?

>>static void split_to_parms(apreq_request_t *req, const char *data)
>>{
>>     request_rec *r = req->r;
>>     const char *val;
>>
>>     while (*data && (val = my_urlword(r->pool, &data))) {
>>         const char *key = ap_getword(r->pool, &val, '=');
>>
>>         req_plustospace((char*)key);
>>         ap_unescape_url((char*)key);
>>         req_plustospace((char*)val);
>>         ap_unescape_url((char*)val);
>>
>>         apr_table_add(req->parms, key, val);
>>     }
>>
>>}
>>
>>I'd write it as:
>>
>>     req_plustospace((char*)data);
>>     ap_unescape_url((char*)data);
> 
>        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> That's wrong.  Moving plustospace outside the loop is OK, but
> you can't safely/correctly unescape *data until AFTER you've 
> split the string using the [;&] tokens.  You might include a
> test case using data from a form like
> 
>   <input name="email address" value="default;blank&amp;unknown" />

Yes, it's clear now. Thanks!
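
For the archive, a small sketch of why the order matters, written in plain C without APR. The helpers percent_decode and count_fields are hypothetical names for this illustration; percent_decode is a simplified stand-in for ap_unescape_url and does not claim to match its behavior on malformed input. The sample string is my assumption of how a browser would encode the form field quoted above.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode %XX escapes in place. Simplified stand-in for ap_unescape_url:
 * malformed escapes are copied through unchanged. */
static void percent_decode(char *s)
{
    char *out = s;

    assert(s != NULL);
    while (*s) {
        if (s[0] == '%' && isxdigit((unsigned char)s[1])
                        && isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Count the fields a splitter on '&' or ';' would produce. */
static int count_fields(const char *s)
{
    int n;

    assert(s != NULL);
    if (*s == '\0')
        return 0;
    n = 1;
    for (; *s; ++s)
        if (*s == '&' || *s == ';')
            ++n;
    return n;
}
```

With the encoded input "email+address=default%3Bblank%26unknown", count_fields reports one field. If percent_decode runs first, the escaped ';' and '&' become real delimiters and the same data appears to contain three fields, which is the failure Joe describes.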

>>     while (*data && (val = my_urlword(r->pool, &data))) {
>>         apr_table_add(req->parms, ap_getword(r->pool, &val, '='), val);
>>     }
>>
>>I'm not sure if it's valid to modify 'val' in the second arg and
>>expect its value to be modified in the third. I must use a tmp var,
>>right? 
> 
> 
> Right. In C parlance, the code exhibits Undefined Behavior.  Given 
> the way many compilers handle function args (right to left), it's 
> likely your code will do the Wrong Thing here.  A demon might fly 
> out of your nose if you try to run it, or so goes the legend of 
> comp.lang.c :-)

Yup, thanks for confirming that. I think I read this before in the C FAQ.
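
A sketch of the temporary-variable version, in plain C without APR. C leaves the order in which function arguments are evaluated unspecified, so the safe pattern is to finish the call that advances val in its own statement. Here take_word, struct pair, record_pair and split_field are all hypothetical names: take_word only approximates ap_getword (it uses malloc instead of a pool), and record_pair stands in for apr_table_add.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy the text up to the first sep into a new malloc'd string and advance
 * *line past the separator. Roughly like ap_getword, minus APR pools. */
static char *take_word(const char **line, char sep)
{
    const char *end;
    size_t len;
    char *word;

    assert(line != NULL && *line != NULL);
    end = strchr(*line, sep);
    len = end ? (size_t)(end - *line) : strlen(*line);
    word = malloc(len + 1);
    if (word == NULL)
        return NULL;
    memcpy(word, *line, len);
    word[len] = '\0';
    *line += len + (end ? 1 : 0);
    return word;
}

/* Hypothetical sink standing in for apr_table_add: records what it saw. */
struct pair { const char *key; const char *val; };

static void record_pair(struct pair *out, const char *key, const char *val)
{
    out->key = key;
    out->val = val;
}

/* Split "key=value" using an explicit temporary, so take_word has finished
 * advancing val before val is read as an argument. */
static void split_field(struct pair *out, const char *field)
{
    const char *val = field;
    char *key = take_word(&val, '=');   /* completes before the next line */

    record_pair(out, key, val);         /* val now points past the '=' */
}
```

For the field "color=blue" this records the key "color" and the value "blue" regardless of how the compiler orders argument evaluation; the caller owns the malloc'd key and should free it.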

-- 


__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com

