Return-Path: X-Original-To: apmail-mesos-dev-archive@www.apache.org Delivered-To: apmail-mesos-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id C79D9176C3 for ; Thu, 20 Nov 2014 23:03:29 +0000 (UTC) Received: (qmail 22477 invoked by uid 500); 20 Nov 2014 23:03:29 -0000 Delivered-To: apmail-mesos-dev-archive@mesos.apache.org Received: (qmail 22395 invoked by uid 500); 20 Nov 2014 23:03:29 -0000 Mailing-List: contact dev-help@mesos.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@mesos.apache.org Delivered-To: mailing list dev@mesos.apache.org Received: (qmail 22377 invoked by uid 99); 20 Nov 2014 23:03:29 -0000 Received: from reviews-vm.apache.org (HELO reviews.apache.org) (140.211.11.40) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 20 Nov 2014 23:03:29 +0000 Received: from reviews.apache.org (localhost [127.0.0.1]) by reviews.apache.org (Postfix) with ESMTP id 579C011710D; Thu, 20 Nov 2014 23:03:28 +0000 (UTC) Content-Type: multipart/alternative; boundary="===============7183510578114992189==" MIME-Version: 1.0 Subject: Re: Review Request 28253: Add query string joiner From: "Ben Mahler" To: "Cody Maloney" , "Ben Mahler" , "mesos" Date: Thu, 20 Nov 2014 23:03:28 -0000 Message-ID: <20141120230328.16171.76823@reviews.apache.org> X-ReviewBoard-URL: https://reviews.apache.org Auto-Submitted: auto-generated Sender: "Ben Mahler" X-ReviewGroup: mesos X-ReviewRequest-URL: https://reviews.apache.org/r/28253/ X-Sender: "Ben Mahler" References: <20141120211157.15325.74299@reviews.apache.org> In-Reply-To: <20141120211157.15325.74299@reviews.apache.org> Reply-To: "Ben Mahler" X-ReviewRequest-Repository: mesos-git --===============7183510578114992189== MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit > On Nov. 20, 2014, 9:11 p.m., Ben Mahler wrote: > > 3rdparty/libprocess/include/process/http.hpp, lines 415-416 > > > > > > Here's what we'll have if we added join: > > > > parse: takes a __decoded__ query string, construct a key,value map of the parameters > > > > join: takes a key,value map of parameters, creates an __encoded__ query string. > > > > That's where I find this to be non-intuitive: > > > > ``` > > join(parse(query)) == encode(query) > > ``` > > > > It seems like only request encoding should deal with constructing an encoded query string, is that not the case? Would like to avoid exposing this if possible, and if not possible, would like to make this more intuitive. > > Cody Maloney wrote: > So that behavior of parse is going to produce incorrect results I believe. Suppose one of my parameters in my query string is a small blob of JSON (Contains '=' and '&' potentially). If we encode/decode on the outside of query string parsing/joining, we can't represent this (The & will look like the start of a new parameter). > > ``` > std::map<> query = {{"a%20","b&c=d!@#foobar"}} > > EXPECT_EQ(query, parse(join(query))); > ``` > > Based on my reading of rfc3986 2.1 (snippet inline): > ``` > A percent-encoding mechanism is used to represent a data octet in a > component when that octet's corresponding character is outside the > allowed set or is being used as a delimiter of, or within, the > component. > ``` > > This should work, and is the reason why there is percent encoding. From what I can tell grepping throuhg the source code nothing currently does the decoding before we parse the query string, so making the decode live inside of the parse function should work. Currently callers just end up with percent encoded chunks in their query string, which causes badness (And could explain an issue which a customer was having with files/browse.json and things not showing up) > > Ben Mahler wrote: > Hm.. so I did the grep and as expected the callers parse decoded strings. > > The original use of parse is in Decoder: > > ``` > static int on_message_complete(http_parser* p) > { > DataDecoder* decoder = (DataDecoder*) p->data; > > // Parse the query key/values. > Try decoded = http::decode(decoder->query); > if (decoded.isError()) { > return 1; > } > decoder->request->query = http::query::parse(decoded.get()); > ... > ``` > > Note that here we parse a decoded string, as expected. The other use appears to be within the http code in the master that parses request __bodies__: > > ``` > Future Master::Http::shutdown(const Request& request) > { > if (request.method != "POST") { > return BadRequest("Expecting POST"); > } > > // Parse the query string in the request body (since this is a POST) > // in order to determine the framework ID to shutdown. > hashmap values = > process::http::query::parse(request.body); > ``` > > Bodies are not expected to be percent encoded, IIUC. > > Anything I missed...? > > Cody Maloney wrote: > So we do, not sure how I missed that... > > So I can fix up the decoding to be inside the query string. Simple extra patch, then add some round trip tests. Sound good? > > The body one is interesting in that it looks like we're doing it incorrectly: > http://stackoverflow.com/questions/4007969/application-x-www-form-urlencoded-or-multipart-form-data/4073451#4073451 > > We're assuming application/x-www-form-urlencoded at the moment which we should decode. None of the requests happen to be from non-libprocess things or contain illegal characters so it happens to work. > > *random side thoughts*: We aren't a very good HTTP/webserver / parsing into request/response objects, processing Content-Type, doing simple clean routing. The more I dig into this code the more I think it would be nice if there was just an nginx, apache, proxygen / etc. which are good at doing the HTTP parts and we just take their parsing + methods, getting us out of the game of writing an HTTP server, then just have the bits to bridge / convert their request objects to the appropriate libprocess dispatches. > So I can fix up the decoding to be inside the query string. Do you mean decoding to be inside query::parse? Sounds bad because it cannot assume the string is always encoded and decoding is not idempotent IIUC. If you _really_ need this, let's just provide the map -> string functionality, and leave the encoding responsibility to the Request encoder. As much as possible, the Encoder / Decoder code should be the only code responsible for encoding and decoding, but maybe I'm missing some context on this. - Ben ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/28253/#review62422 ----------------------------------------------------------- On Nov. 20, 2014, 12:50 a.m., Cody Maloney wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/28253/ > ----------------------------------------------------------- > > (Updated Nov. 20, 2014, 12:50 a.m.) > > > Review request for mesos. > > > Bugs: MESOS-2132 > https://issues.apache.org/jira/browse/MESOS-2132 > > > Repository: mesos-git > > > Description > ------- > > This is needed to turn Request objects into http request which we send across the wire. The existing HTTP request sending code expects the query string to be already constructed. > > Ideally people can either just forward the query string they get to send it back across the wire or they can manipulate it using the native C++ datatype (A hashmap), and then have it automatically re-encoded. > > The parsing logic to go from a string to a map is already exists in this header, this just adds the other direction. > > > Diffs > ----- > > 3rdparty/libprocess/include/process/http.hpp 9cf05acbb724ab9af8010d1788621d37a0e48e86 > 3rdparty/libprocess/src/http.cpp b00f33339366f5c06b6f20e38c5ae0c23b8a9358 > 3rdparty/libprocess/src/tests/http_tests.cpp a90e65f77904da0a45e1cc0cc9889ae69354a1a5 > > Diff: https://reviews.apache.org/r/28253/diff/ > > > Testing > ------- > > make distcheck ubuntu 14.04 > > > Thanks, > > Cody Maloney > > --===============7183510578114992189==--