From: Bojan Smojver
To: dev@apr.apache.org
Date: Thu, 29 Jun 2006 09:09:26 +1000
Subject: Re: Binary data in apr dbd - where should buckets come from
Message-ID: <20060629090926.zkraogsvogcoo4sg@www.rexursive.com>
In-Reply-To: <20060628051313.56816.qmail@web36714.mail.mud.yahoo.com>

Quoting Alex Dubov:

> I was working on my long pending changes to
> apr_dbd_mysql when I found that I have no clue how to
> return buckets from apr_dbd_get_entry.
> Some problems:
> 1. Which bucket_alloc should I use and who gets it
> (apr_dbd_get_row or apr_dbd_get_entry)?
>
> 2. There's some gains in having pool passed to
> apr_dbd_get_entry (to defer the check for truncation
> until the value is really needed). In general, it may
> be better to hold a pool passed to apr_dbd_pselect,
> instead of passing a new one to get_row/get_entry.
>
> 3. Alternatively, I can use buckets for everything
> (including strings and simple types). In this case,
> apr_dbd_p(v)select should get bucket_alloc instead.
> apr_dbd_get_entry then may choose to return pointer to
> bucket or pointer to bucket's content.

I would leave all the functions we have now the way they are - they should just take and return const char *, but should encode in ASCII for types like BLOBs (i.e. length:[column:table:]payload business). We already have this interface (i.e. the "string way") and it is handy, so I think we should keep it.

We should introduce new functions for all binary stuff. As I presented in one of my previous e-mails to Chris:

-----------------------------------------------------
As for strings v. natives v. structures (i.e. your point 6), I think we should handle this by having a whole set of new functions for this purpose.

I would personally keep the existing functions the way they are, because:

- passing strings in/out for everything is handy
- it ensures backward compatibility

I would enhance this behaviour for existing functions to understand a few more things, like floats, doubles, longs, shorts, timestamps (all passed in as strings) and BLOBs (ASCII encoded, as I originally suggested). This would give us a whole lot of "strings only" stuff to work with. Not sure if formatting strings in apr_vformatter should really be related to SQL data type info we are passing in/out here, but if the list thinks this is the right way to go, I have no problem with it.

So, now that we have the "strings" API out of the way, I think we should also introduce a new "binary" API for native data types.

First, I would keep the _prepare identical for both "strings" and "binary" interface. The formats used should be able to cater for both. In this phase, we just "hint" what is to be expected, but not "hardcode" anything.

Then, we can have "binary" equivalents of p[v]query/select and get_entry. Here is how I see them working:

All "simple" types like int, long, float, double etc. are passed in by pointer only. No need to employ any kind of wrapper structure - lengths and types are known by the compiler and we can map those directly to SQL types too.

Some other types, like timestamps, dates and times are probably best passed as their string representations, as this is what SQL backends can work with, as well as C native APIs, through conversion functions.

BLOBs and such could be passed in through a structure defining all required elements (length and binary data), including the infamous column/table info for Oracle.

The binary equivalent of get_entry would then return relevant pointers. The caller already knows what that is - he/she is the one doing this in the first place, no need to wrap all this with unneeded info.

Basically, I'm trying to take the shortest path from A to B.
If we can pass native as is, we do. If we can "cheat" by using strings, we do. For everything else, we do "proper". In other words, if it needs wrapping, we wrap.

So, the caller can follow one of two paths:

    apr_dbd_prepare()
    apr_dbd_p[v]query/select()  --> takes all args as strings
    apr_dbd_get_row()
    apr_dbd_get_entry()         --> returns all strings

or

    apr_dbd_prepare()
    apr_dbd_bp[v]query/select() --> takes various pointer args
    apr_dbd_get_row()
    apr_dbd_get_bentry()        --> returns void *

Obviously, we'd have some meaningful function names for all this.
-----------------------------------------------------

I think the first order of business would be to put parsing of SQL queries into the public function (i.e. apr_dbd_prepare()), so that this part is always done exactly the same way for all backends. We could then pass an extra argument to the underlying driver function (this wouldn't break binary compatibility, as those functions "don't exist" from the caller's point of view), which would be a "pointer to an array of types of parameters" that we parsed, expressed in DBD speak (i.e. we could have an enumerated type for this).

Once the driver functions get this, all they need to do is prepare the statement accordingly (i.e. in the backend specific way).

Then, [b]p[v]select/query functions can fetch arguments either as const char * (the "string way") or as other types of pointers (the "binary way") and use them. Finally, the get_[b]entry can return either const char * (the "string way") or other type of pointer (the "binary way"). And since we know what underlying SQL types map to in C land (because that's what our "binary" interface definition is all about), we don't need any "formatting" for fetching. The caller already knows what's going to come back - he/she defined the SQL columns after all.

Ah yes, the buckets/brigades...
I'd use them only when required - for types like BLOB and maybe binary TEXT, where we may need to get stuff in multiple chunks due to size.

At least that's my take...

-- 
Bojan
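
A minimal sketch of the "pointer to an array of types of parameters" idea described above, to make the proposal a bit more concrete. Everything in it is an assumption made for illustration: the enum dbd_sketch_type_t, the function dbd_sketch_parse_types() and the placeholder letters (%s, %d, %f, %b) are hypothetical and are not part of the APR DBD API. It only shows how a single public parsing step could hand each driver the same list of parameter types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter types "in DBD speak"; NOT part of the APR API. */
typedef enum {
    DBD_SKETCH_TYPE_STRING,   /* assumed placeholder: %s */
    DBD_SKETCH_TYPE_INT,      /* assumed placeholder: %d */
    DBD_SKETCH_TYPE_DOUBLE,   /* assumed placeholder: %f */
    DBD_SKETCH_TYPE_BLOB      /* assumed placeholder: %b */
} dbd_sketch_type_t;

/*
 * Scan the query once and record the type of every placeholder, in order.
 * "%%" is treated as a literal percent sign and is not recorded.
 * Returns the number of placeholders found, or -1 if the query contains an
 * unknown specifier, ends in a lone '%', or has more than max_types
 * placeholders.
 */
int dbd_sketch_parse_types(const char *query,
                           dbd_sketch_type_t *types,
                           size_t max_types)
{
    size_t n = 0;
    const char *p;

    assert(query != NULL && types != NULL);

    for (p = query; *p != '\0'; p++) {
        dbd_sketch_type_t t;

        if (*p != '%')
            continue;

        p++;                      /* move to the specifier character */
        switch (*p) {
        case '%':
            continue;             /* escaped percent, nothing to record */
        case 's':
            t = DBD_SKETCH_TYPE_STRING;
            break;
        case 'd':
            t = DBD_SKETCH_TYPE_INT;
            break;
        case 'f':
            t = DBD_SKETCH_TYPE_DOUBLE;
            break;
        case 'b':
            t = DBD_SKETCH_TYPE_BLOB;
            break;
        default:
            return -1;            /* unknown specifier or trailing '%' */
        }

        if (n == max_types)
            return -1;            /* caller's array is too small */
        types[n++] = t;
    }
    return (int)n;
}
```

For example, parsing "SELECT * FROM x WHERE a = %d AND b = %s" with room for four entries would return 2, with the first entry an INT and the second a STRING. A driver receiving that array would only have to bind parameters in its own backend specific way, without parsing the query again.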