db-derby-dev mailing list archives

From Army <a...@golux.com>
Subject On DERBY-107 : ODBC Metadata functions
Date Fri, 21 Jan 2005 22:49:43 GMT

Kathey Marsden and I had an offline discussion earlier today regarding the addition of ODBC-compliant metadata functions to the Derby engine.  That discussion is summarized below.  While feedback on all aspects of this discussion is of course encouraged, sections labelled with "[ **** COMMUNITY INPUT? **** ]" mark specific subjects/questions for which input from the Derby community would be helpful and appreciated.

If you don't have the time or motivation to read this entire email, you might at least want to search for the community input tag (there are 3 in this message) and see what the questions at hand are; we could certainly use any input people might have.

-----------------

BACKGROUND:

Per DERBY-107, I (Army) am working to add ODBC-compliant metadata functions to the Derby engine.  This will allow connections made to the Derby Network Server via an ODBC driver to retrieve Derby metadata in a format that conforms to the ODBC specification, which can be found here:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlcolumns.asp

The following is pasted from the proposal description of DERBY-107:

-- begin paste --

Metadata calls over Network Server are executed via system procedures that are built into Derby--and these system procedures then map to the SQL statements [found in Derby's "metadata.properties" file].  The procedures already have an "OPTION" parameter that accepts a string, so what I would propose is that we add logic to recognize a "DATATYPE" keyword as part of this string--basically, if we see the string "DATATYPE='ODBC'" in the OPTION parameter, then the stored procedure would map to a new set of SQL statements in the "metadata.properties" file--and these new statements would return result sets that conform to the ODBC standard.  Otherwise, if "DATATYPE='JDBC'" is specified, or no DATATYPE keyword is specified at all, the current SQL metadata statements would be used by default.

-- end paste --

At this point, I have 1) written the relevant ODBC-compliant SQL statements (though I still need to revise and test them, of course!), 2) added those SQL statements to the "metadata.properties" file, and 3) added logic to recognize the "DATATYPE='ODBC'" option in the org.apache.derby.catalog.SystemProcedures.java file.
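As a rough illustration of the keyword recognition described above, here is a minimal sketch of how an OPTION string might be checked.  The class and method names are hypothetical, not Derby's actual code:

```java
// Hypothetical sketch: recognizing a DATATYPE keyword in the OPTION string.
// Not Derby's actual implementation -- names and parsing are illustrative only.
public class OptionParser {

    /**
     * Returns "ODBC" if the option string contains DATATYPE='ODBC';
     * otherwise defaults to "JDBC" (matching the proposal's default behavior).
     */
    public static String getDataType(String options) {
        if (options != null
                && options.toUpperCase().contains("DATATYPE='ODBC'")) {
            return "ODBC";
        }
        return "JDBC";
    }

    public static void main(String[] args) {
        System.out.println(getDataType("DATATYPE='ODBC'"));
        System.out.println(getDataType(null));
    }
}
```

The real parsing in SystemProcedures would presumably be stricter than a substring check, but the decision structure (explicit ODBC vs. default JDBC) is the point here.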

EXAMPLE:

(Note: this is written insofar as I understand it; if I'm saying anything wrong here, feel free to correct me.)

Consider the java.sql.DatabaseMetaData method "getColumns".  Currently, if an application makes a JDBC call to this getColumns() method against a Derby driver, Derby will map it to an internal system procedure called "SQLCOLUMNS", the definition of which resides in the SystemProcedures class.  That method, in turn, fetches an instance of Derby's EmbedDatabaseMetadata class and then calls Derby's implementation of getColumns() on that instance.  The EmbedDatabaseMetadata class then executes the appropriate prepared SQL statement from the "metadata.properties" file and returns the ResultSet.

That said, it is the SQLCOLUMNS procedure in SystemProcedures that takes the OPTION parameter mentioned above, and so it is that method which must recognize the "DATATYPE='ODBC'" keyword.
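To make that call flow concrete, here is a heavily simplified sketch.  The class and method names mirror the real ones, but the signatures and the single-entry stand-in for "metadata.properties" are assumptions for illustration -- the real methods take more parameters and return a java.sql.ResultSet:

```java
import java.util.Properties;

// Simplified sketch of the SQLCOLUMNS -> EmbedDatabaseMetadata -> 
// metadata.properties flow described above.  Signatures are NOT Derby's.
public class MetadataFlow {

    // stand-in for the statements loaded from metadata.properties
    static final Properties metadata = new Properties();
    static {
        metadata.setProperty("getColumns",
            "SELECT ... FROM SYS.SYSCOLUMNS ...");
    }

    // stand-in for SystemProcedures.SQLCOLUMNS: the system procedure
    // simply delegates to the metadata implementation
    static String SQLCOLUMNS(String catalog, String schema, String table) {
        return getColumns();
    }

    // stand-in for EmbedDatabaseMetadata.getColumns: looks up and
    // (in real Derby) executes the prepared statement for this key
    static String getColumns() {
        return metadata.getProperty("getColumns");
    }

    public static void main(String[] args) {
        System.out.println(SQLCOLUMNS(null, null, "T"));
    }
}
```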

DISCUSSION:

The question is now this: what's the best/preferred way to propagate this ODBC/JDBC duality from the "SystemProcedures.java" file to the corresponding methods in org.apache.derby.impl.jdbc.EmbedDatabaseMetadata.java (hereafter referred to as "EDM")?

Option I:

Add new SQL statements, such as "getColumnsForODBC", to the existing metadata.properties file, as described in the proposal for DERBY-107.  Then, since EDM has to know which version of a given SQL statement to execute--for example, should it call the regular "getColumns" version, or should it call the new "getColumnsForODBC" version?--we could add new methods (such as "setForODBC()") to EDM that could be used by SystemProcedures to indicate (to EDM) that ODBC metadata should be returned, instead of JDBC metadata.  Note that, since SystemProcedures is in a different package from EDM, the new methods on EDM would (I think) have to be _public_.
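A minimal sketch of what Option I might look like, assuming a hypothetical setForODBC() method and illustrative statement-key names:

```java
// Sketch of Option I (hypothetical names, not Derby's actual code):
// a public mutator on the metadata class selects which set of
// prepared statements gets executed.
public class EmbedDatabaseMetadataSketch {

    private boolean forODBC = false;

    // the new *public* method Option I would add
    public void setForODBC(boolean forODBC) {
        this.forODBC = forODBC;
    }

    // picks which statement in metadata.properties to execute
    public String queryKey() {
        return forODBC ? "getColumnsForODBC" : "getColumns";
    }
}
```

The drawback discussed below is precisely that setForODBC() must be public for SystemProcedures (in another package) to call it.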

Regarding this approach, one must ask:

[ #1 **** COMMUNITY INPUT? **** ]

What's the general attitude toward adding public methods to a Derby class that implements a specific JDBC interface?  In the context of this discussion, is it or is it not acceptable/desirable to add Derby-specific public methods to a class like EmbedDatabaseMetadata.java, which is an implementation of java.sql.DatabaseMetaData?  Technically speaking, I don't think the addition of public methods breaks the JDBC standard (so long as we aren't telling people that they can import EmbedDatabaseMetadata in their apps--which we aren't), but I'm curious whether there's a "good programming practice" here that the Derby community would like to (or already does?) hold to.

[ #1 **** End **** ]

Option II:

Add new SQL statements, such as "getColumnsForODBC", to the existing metadata.properties file, as described in the proposal for DERBY-107.  Then we could extend the EDM class with a new, simple class that sets ODBC-related state, and modify EDM to check that state and execute the appropriate statements.  For example, we could add a protected variable "forODBC" to EDM, default it to "false", and then set it to true in the extended class for ODBC.  EDM would then check the flag and execute the corresponding metadata statement.  The presumption here is that SystemProcedures would check for the ODBC indicator and, if found, use an instance of the new subclass for the metadata calls, instead of an instance of the existing EDM.

This approach allows us to avoid adding new (non-JDBC) public methods to EDM, at the cost of creating another (albeit fairly simple) metadata class.
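Option II might be sketched as follows (hypothetical names again): the ODBC state lives in a protected flag that a trivial subclass flips, so no new public methods are exposed.

```java
// Sketch of Option II (hypothetical names, not Derby's actual code):
// the base metadata class checks a protected flag; a trivial subclass
// sets it, so nothing new becomes part of the public surface.
public class MetadataBase {

    protected boolean forODBC = false;

    // picks which statement in metadata.properties to execute
    public String queryKey() {
        return forODBC ? "getColumnsForODBC" : "getColumns";
    }
}

// the "new, simple class" described above
class ODBCMetadata extends MetadataBase {
    ODBCMetadata() {
        forODBC = true;
    }
}
```

SystemProcedures would then construct an ODBCMetadata instead of the base class whenever it sees the DATATYPE='ODBC' option.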

With this approach, we could even go further and add another file, say "odbc_metadata.properties", that holds the ODBC metadata statements (instead of adding them to the existing metadata.properties file).  The new subclass could then load _that_ file instead of the current metadata.properties file, which gives us a nice separation of functionality: all of the ODBC code cleanly separated from the JDBC code.  Of course, that could be a bad thing, too, since 1) we'd then have TWO metadata files to worry about in the codeline, instead of just one, which introduces room for error if/when metadata-related processing changes occur in Derby, and 2) we'd have to duplicate any SQL statements that are the same for ODBC and JDBC (ex. several of the "getBestRowIdentifier" queries) in both files.  So I'm guessing we wouldn't want to create another metadata file...but I thought I'd bring it up, just in case.

Option III:

Create some kind of internal VTI (virtual table interface) for ODBC metadata and use that.  I have to admit that I don't know too much about how VTIs work, but Kathey gave me some places to look, so I'm going to read up.  Apparently, we can execute the same metadata SQL statements that already exist for JDBC, then use a VTI to "massage" the result set into something that complies with the ODBC specification.  This might be a good choice given that most of the differences between ODBC and JDBC are in the types of the columns returned.  For example, JDBC might say that any String will do, whereas ODBC will say it has to be VARCHAR.  In that case, a literal value ' ' is fine for JDBC, but since it's treated as a CHAR value by Derby, it would break the ODBC standard.  With a VTI, we could theoretically accomplish the same things that we'd be doing with new SQL statements--such as casting ' ' to VARCHAR in this particular case.  Other modifications we'd have to implement include casting certain integer columns to smallints, and replacing null values in JDBC (such as for the "sql_data_type" and "buffer_length" columns) with legal values (neither column is supposed to be null for ODBC).
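The kinds of per-column transforms such a VTI (or, equivalently, the new SQL statements) would have to perform might look like this sketch.  These are representative examples only -- the method names and the idea of a caller-supplied default are assumptions, not Derby's actual code:

```java
// Illustrative sketch of the per-column "massaging" described above.
// Not Derby's VTI code -- just the shape of the transforms involved.
public class OdbcMassage {

    // ODBC requires SMALLINT in some columns where JDBC allows INTEGER
    static short toSmallint(int jdbcValue) {
        return (short) jdbcValue;
    }

    // ODBC forbids NULL in columns like SQL_DATA_TYPE and BUFFER_LENGTH;
    // substitute a legal value chosen by the caller when JDBC returned null
    static int fillNull(Integer jdbcValue, int legalDefault) {
        return (jdbcValue == null) ? legalDefault : jdbcValue;
    }
}
```

The CHAR-vs-VARCHAR case from the paragraph above would be handled the same way in SQL, e.g. CAST(' ' AS VARCHAR(1)) instead of the bare literal.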

The upside to this is that we still have only a single metadata.properties file, which (theoretically) makes maintenance of the metadata procedures easier.  As I don't know much about VTIs, I can't say what else this approach would require, but it seems safe to say that it would at least require another class to serve as the ODBC VTI.  How that would tie into the SystemProcedures and EmbedDatabaseMetadata classes, I don't know yet...

So...

[ #2 **** COMMUNITY INPUT? **** ]

What are people's feelings/reactions/comments regarding which of the above three options is most appropriate for the Derby codeline?  In a nutshell, it comes down to 1) new public methods on EmbedDatabaseMetadata, 2) a new subclass of EmbedDatabaseMetadata that sets ODBC-related state (with or without creating a new odbc_metadata.properties file), and 3) an ODBC VTI that manipulates existing JDBC result sets into ODBC-compliant ones.

NOTE: another thing we have to decide is whether we're going to support different levels of ODBC metadata--for example, ODBC 1.0 metadata differs from ODBC 2.0 in some respects, and likewise for ODBC 3.0.  Do we want to support all of the different levels?  If so, which of the above approaches is going to make that most feasible?  My guess is that option III would be the best choice in that case...

[ #2 **** End **** ]

The final part of Kathey's and my discussion involved the issue of Derby upgrade.  While it IS true that, as Dan said, "no upgrade code has been added to support upgrading databases from 10.0 to 10.1" (see http://nagoya.apache.org/eyebrowse/ReadMsg?listName=derby-dev@db.apache.org&msgNo=1703), it nonetheless seems like a good idea to consider what effects the changes to Derby metadata might have on upgrade.

[ #3 **** COMMUNITY INPUT? **** ]

Does anyone know anything about the upgrade implications of any of the above-mentioned approaches?  Are there caveats to be considered, or is there any advice that might save us headaches in the future when support for upgrades is developed?  For example, with Cloudscape upgrade/downgrade, did Cloudscape drop and then recreate the system stored prepared statements?  And if so, what are the odds that Derby will do the same?  Would that be automatic, or would we have to add logic for each of the prepared statements in question?  Apologies in advance for my lack of knowledge in that area...

[ #3 **** End **** ]

-----------------

Thanks to everyone who took the time to read all or part of this email.

Feedback is greatly appreciated...
Army
