Delivered-To: apmail-httpd-apreq-dev-archive@httpd.apache.org
Received: (qmail 50825 invoked by uid 500); 19 Apr 2002 15:44:01 -0000
Mailing-List: contact apreq-dev-help@httpd.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list apreq-dev@httpd.apache.org
Received: (qmail 50813 invoked from network); 19 Apr 2002 15:44:00 -0000
Message-ID: <3CC03B41.4000100@stason.org>
Date: Fri, 19 Apr 2002 23:44:01 +0800
From: Stas Bekman
Organization: Hope, Humanized
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.9) Gecko/00200203
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Joe Schaefer
Cc: apreq-dev@httpd.apache.org
Subject: Re: the effectiveness of split_to_parms
References: <3CBFD155.1020201@stason.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N

Joe Schaefer wrote:
> Stas Bekman writes:
>
>
>>in request.c
>>
>>Any reason why we call unescape and req_plustospace functions in a
>>loop, instead of doing this once on *data? I think calling the function
>>once on a long string is faster than calling the same function 20 times
>>(if we have 10 fields) on shorter strings.
>>
>>Also what's the use of calling req_plustospace on the key string?
>
>
> Because the key string might contain plus signs!

Sorry, I don't get you. Let's say the key string includes plus signs:
what does it matter when you s/+/ /g? You split on &|; and then on =,
so doing s/+/ / before or after the split is all the same, unless I'm
missing something.

Also, is it part of the RFC that one should s/+/ / in the keys? I think
+ is used for denoting a list of arguments for the same key. I can't
find the spec for + in the latest URL RFC. Which one is the latest, 1738?

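To make the "before or after the split is all the same" argument concrete, here is a minimal sketch in plain C. It does not use the real apreq or APR functions: plus_to_space(), split_pair() and same_either_way() are hypothetical stand-ins written only for this illustration. The point it shows is that '+' is never a delimiter, so converting it to a space cannot move a '=' split point.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for req_plustospace(): s/+/ /g, in place. */
static void plus_to_space(char *s)
{
    assert(s != NULL);
    for (; *s; s++)
        if (*s == '+')
            *s = ' ';
}

/* Split "key=value" in place at the first '='.
 * Returns the value, or a pointer to the terminating NUL if there is no '='. */
static char *split_pair(char *pair)
{
    char *eq;

    assert(pair != NULL);
    eq = strchr(pair, '=');
    if (eq == NULL)
        return pair + strlen(pair);
    *eq = '\0';
    return eq + 1;
}

/* Returns 1 if "convert, then split" and "split, then convert"
 * produce the same key and the same value for this pair. */
static int same_either_way(const char *pair)
{
    char a[128], b[128];
    char *aval, *bval;

    if (strlen(pair) >= sizeof a)
        return 0;                /* too long for this small demo buffer */
    strcpy(a, pair);
    strcpy(b, pair);

    plus_to_space(a);            /* order 1: convert the whole string first */
    aval = split_pair(a);

    bval = split_pair(b);        /* order 2: split first ... */
    plus_to_space(b);            /* ... then convert the key ... */
    plus_to_space(bval);         /* ... and the value separately */

    return strcmp(a, b) == 0 && strcmp(aval, bval) == 0;
}
```

For example, same_either_way("first+name=Stas+Bekman") returns 1: both orders give the key "first name" and the value "Stas Bekman". This only covers the '+' conversion; it says nothing about %XX unescaping, which can produce new delimiter characters.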
>>static void split_to_parms(apreq_request_t *req, const char *data)
>>{
>>    request_rec *r = req->r;
>>    const char *val;
>>
>>    while (*data && (val = my_urlword(r->pool, &data))) {
>>        const char *key = ap_getword(r->pool, &val, '=');
>>
>>        req_plustospace((char*)key);
>>        ap_unescape_url((char*)key);
>>        req_plustospace((char*)val);
>>        ap_unescape_url((char*)val);
>>
>>        apr_table_add(req->parms, key, val);
>>    }
>>
>>}
>>
>>I'd write it as:
>>
>>    req_plustospace((char*)data);
>>    ap_unescape_url((char*)data);
>
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> That's wrong. Moving plustospace outside the loop is OK, but
> you can't safely/correctly unescape *data until AFTER you've
> split the string using the [;&] tokens. You might include a
> test case using data from a form like
>
>

Yes, it's clear now. Thanks!

>>    while (*data && (val = my_urlword(r->pool, &data))) {
>>        apr_table_add(req->parms, ap_getword(r->pool, &val, '='), val);
>>    }
>>
>>I'm not sure if it's valid to modify 'val' in the second arg and
>>expect its value to be modified in the third. I must use a tmp var,
>>right?
>
>
> Right- in C parlance, the code exhibits Undefined Behavior. Given
> the way many compilers handle function args (right to left), it's
> likely your code will do the Wrong Thing here. A demon might fly
> out of your nose if you try to run it, or so goes the legend of
> comp.lang.c :-)

Yup, thanks for confirming that. I think I read this before in the C FAQ.

-- 
__________________________________________________________________
Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker
http://stason.org/     mod_perl Guide ---> http://perl.apache.org
mailto:stas@stason.org http://use.perl.org http://apacheweek.com
http://modperlbook.org http://apache.org   http://ticketmaster.com
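The two points settled above (split on [;&] before unescaping, and fetch the key into a temporary before passing key and val to another call) can be sketched in plain C without APR. This is only an illustration under stated assumptions: struct parm, MAX_PARMS, decode_in_place(), next_word() and this split_to_parms() signature are hypothetical stand-ins, not the apreq code. The temporary matters because C leaves the order in which function arguments are evaluated unspecified.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PARMS 16

struct parm { char *key; char *val; };   /* hypothetical result slot */

/* Decode '+' and %XX escapes in place; malformed escapes are copied as-is. */
static void decode_in_place(char *s)
{
    char *out = s;

    assert(s != NULL);
    while (*s) {
        if (*s == '+') {
            *out++ = ' ';
            s++;
        } else if (s[0] == '%' && isxdigit((unsigned char)s[1])
                               && isxdigit((unsigned char)s[2])) {
            char hex[3] = { s[1], s[2], '\0' };
            *out++ = (char)strtol(hex, NULL, 16);
            s += 3;
        } else {
            *out++ = *s++;
        }
    }
    *out = '\0';
}

/* Cut the next '&'- or ';'-delimited word off *data, in place. */
static char *next_word(char **data)
{
    char *word = *data;
    size_t len = strcspn(word, "&;");

    *data = word[len] ? word + len + 1 : word + len;
    word[len] = '\0';
    return word;
}

/* Split first, decode afterwards. Returns the number of pairs stored. */
static size_t split_to_parms(char *data, struct parm *out, size_t max)
{
    size_t n = 0;

    assert(data != NULL && out != NULL);
    while (*data && n < max) {
        char *val = next_word(&data);
        char *key = val;                 /* temporary: settle the key first */
        char *eq;

        if (*val == '\0')
            continue;                    /* skip empty words such as "&&" */
        eq = strchr(val, '=');
        if (eq) {
            *eq = '\0';
            val = eq + 1;
        } else {
            val += strlen(val);          /* no '=': empty value */
        }
        decode_in_place(key);            /* safe now: delimiters are gone */
        decode_in_place(val);
        out[n].key = key;
        out[n].val = val;
        n++;
    }
    return n;
}
```

Given a writable buffer holding "a%26b=c+d;e=f%3Dg&h", this yields three pairs: ("a&b", "c d"), ("e", "f=g") and ("h", ""). Had the whole string been unescaped first, "a%26b=c+d" would have become "a&b=c d" and then been split into two bogus words, which is the failure the reply above warns about.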