db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rick Hillegas <Richard.Hille...@Sun.COM>
Subject Re: SQL/MED
Date Fri, 29 Feb 2008 17:18:57 GMT
Hi Frank,

Some comments inline:

Frank Griffin wrote:
> Rick Hillegas wrote:
>> 1) This could grow arbitrarily complicated, depending on how much
>> heavy lifting you want the optimizer to do for you. With a little
>> luck, you could probably trick the optimizer into treating the foreign
>> tables like Table Functions. That would skirt some of the interesting
>> optimizer issues.
>> 2) A later implementation could get the optimizer involved and you
>> could build some interesting cost model for remote tables.
>> 3) Alternatively, you might be able to get an optimizer expert like
>> Army involved early on. He might be able to help you divide and
>> conquer this one.
> The Derby and SQL/MED optimizer support look pretty similar.  SQL/MED
> has GetReplyFirstCost(), GetReplyExecCost(), and GetReplyReExecCost()
> interface functions that the wrapper (i.e. Table Function) supplies to
> the server.  IIRC Table Functions have the same concepts.
> For the initial pass, I sort of assumed that the ability to push
> predicates to a foreign server capable of handling them would minimize
> the effect of aggressive optimization within Derby, no ?
>> You might want to start a wiki page on this topic and take it from there.
> Will do.  Is there a page template I should use ?  Or a project proposal
> page format used elsewhere ?
People have been hanging their proposals here: 
> As a quick summary, here's the way I *think* SQL/MED works, based on
> some initial reading....
> In passthrough mode, the server (Derby) can send the SQL for the
> subquery directly to the wrapper; this obviously only works for foreign
> SQL servers.  In decomposition mode, the server parses the SQL, and
> separates out the subqueries directed at tables controlled by the
> foreign wrapper.  The server makes interface routines available to the
> wrapper which allow it to navigate the subquery parse tree in the
> SQL/MED format.
> There are almost no data structures involved in this.  It's all
> navigational interface calls which return individual bits of
> information.  That means that the SQL/MED parse model and the server
> parse model don't have to agree, as long as the server can supply the
> information requested by the interface routines.
> Conceptually, there is something called a Value Expression that
> corresponds to a syntax token (simple or complex) in the subquery.  If
> the token is complex, the VE has child VEs, and VEs get broken down to
> child VEs until only the four basic VE types --- COLUMN NAME, CONSTANT,
> OPERATOR, and PARAMETER --- are involved.  The SQL/MED model appears to
> assume that all of these VEs, whether for tokens in a SELECT list,
> expressions in a WHERE or ORDER BY, or whatever, are identified and
> strung out in linear fashion, sort of like an array, at least for
> Replies (see below).  The interface calls allow you to request the Nth
> VE of a certain type, e.g. the Nth item in the SELECT list or the Nth
> ORDER BY item.
> The server provides the wrapper with a Request Handle that represents a
> subquery.  The wrapper navigates through the subquery VEs and creates a
> Reply Handle which is conceptually a copy of the Request with the VEs
> the wrapper can't handle removed.  Just as the server provides
> navigational routines to the wrapper which allow it to walk the Request,
> the wrapper must provide a parallel set of navigational routines to the
> server to allow it to walk the Reply.  I'm not yet sure why, but the
> Request VEs are represented by "handles", while the Reply VEs are
> represented by ordinal numbers (see the array reference above);
> otherwise the set of navigational routines for Requests is pretty much a
> mirror image of the set for Replies.
> The server is responsible for detecting any differences between the
> Request and Reply, and changing its execution plan to pick up the
> wrapper's slack (so to speak). It then requests the wrapper to Open the
> various Replies involved, and Iterates through them, using concepts very
> similar to JDBC ResultSet, and incorporating the retrieved data into its
> execution plan very much like data supplied by a Table Function.
> I see this affecting several areas:
> 1) Derby DDL will need to be extended to include the various CREATE
> FOREIGN XXXX statements.  I'd need to be pointed to the changes that
> were made to add CREATE FUNCTION support for Table Functions.
> 2) Navigational routines need to be added which expose the Derby parse
> tree for subqueries to wrappers.  This should be new code, but I'd need
> to be pointed to code or docs for the parse tree model.
> 3) Derby support for incorporating wrappers into its execution plans
> *should* be quite similar to whatever was done for Table Functions.  The
> main difference will be new code to  analyze a Reply using the wrapper's
> navigational routines, compare it to the original Request, and remove
> any Derby predicate handling that was successfully pushed to the wrapper.
> 4) Data Access while iterating through a ResultSet is conceptually the
> same as JDBC, but the mechanics are not identical, since SQL/MED was
> designed to be language-neutral and Java isn't one of the languages
> specifically mentioned.  There will probably need to be new access code
> written to retrieve rows/columns from wrappers, and I'd need to know
> where to insert this.
> While I intend to make the architecture extensible to the full spec, I
> was thinking of limiting support in the first pass to SELECTs only.  I
> think the spec allows any sort of SQL to be passed to a wrapper, but I
> haven't gotten that far yet.
Lots of good stuff. Limiting the scope of what you want to do and 
chunking the work into small increments are good instincts.


View raw message