Kathey Marsden and I had an offline discussion earlier today regarding the addition of ODBC-compliant metadata functions to the Derby engine. That discussion is summarized below. While feedback on all aspects of this discussion are of course encouraged, sections labelled with "[ **** COMMUNITY INPUT? **** ]" are specific subjects/questions for which input from the Derby community would be helpful and appreciated. If you don't have time or motivation to read this entire email, you might at least want to search for the community input tag (there are 3 in this message) and see what the questions at hand are; we could certainly use any input people might have. ----------------- BACKGROUND: Per Derby-107, I (Army) am working to add ODBC-compliant metadata functions to the Derby engine. This will allow connections made to Derby Network Server via an ODBC driver to retrieve Derby metadata in a format that conforms to the ODBC specification, which can be found here: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcsqlcolumns.asp The following is pasted from the proposal description of DERBY-107: -- begin paste -- Metadata calls over Network Server are executed via system procedures that are built-in to Derby--and these system procedures then map to the SQL statements [ found in Derby's "metadata.properties" file]. The procedures already have an "OPTION" parameter that accepts a string parameter, so what I would propose is that we add logic to recognize a "DATATYPE" keyword as part of this string--basically, if we see the string "DATATYPE='ODBC'" in the OPTION parameter, then the stored procedure would map to a new set of SQL statements in the "metadata.properties" file--and these new statements would return result sets that conform to the ODBC standard. Otherwise, if "DATATYPE='JDBC'" or else no DATATYPE keyword is specified, the current SQL metadata statements would be used by default. -- end paste -- At this point, I have 1) written the relevant ODBC-compliant SQL statements (though I still need to revise and test them, of course!), 2) added those SQL statements to "metadata.properties" file, and 3) added logic to recognize the "DATATYPE='ODBC'" option in the org.apache.derby.catalog.SystemProcedures.java file. EXAMPLE: (Note: this is written insofar as I understand it; if I'm saying anything wrong here, feel free to correct me) Consider the java.sql.DatabaseMetaData method "getColumns". Currently, if an application makes a JDBC call to this getColumns() method against a Derby driver, Derby will map it to an internal system procedure called "SQLCOLUMNS", the definition of which resides in the SystemProcedures class. That method, in turn, fetches an instance of Derby's EmbedDatabaseMetadata class, and then calls Derby's implementation of getColumns() on that instance. The EmbedDatabaseMetadata class then executes the appropriate prepared SQL statement from the "metadata.properties" file, and returns the ResultSet. That said, it is the SQLCOLUMNS procedure in SystemProcedures that takes the OPTION parameter mentioned above, and so it is that method which must recognize the "DATATYPE='ODBC'" keyword. DISCUSSION: The question is now this: what's the best/preferred way to propagate this ODBC/JDBC duality from the "SystemProcedures.java" file to the corresponding methods in org.apache.derby.impl.jdbc.EmbedDatabaseMetadata.java (hereafter referred to as "EDM")? Option I: Add new SQL statements, such as "getColumnsForODBC", to the existing metadata.properties file, as described in the proposal for DERBY-107. Then, since EDM has to know which version of a given SQL statement to execute--for example, should it call the regular "getColumns" version, or should it call the new "getColumnsForODBC" version?--we could add new methods (such as "setForODBC()") to EDM that could be used by SystemProcedures to indicate (to EDM) that ODBC metadata should be returned, intead of JDBC metadata. Note that, since SystemProcedures is in a different package than EDM, the new methods on EDM would (I think) have to be _public_. Regarding this approach, one must ask: [ #1 **** COMMUNITY INPUT? **** ] What's the general attitude toward adding public methods to a Derby class that is implementing a specific JDBC class? In the context of this discussion, is it or is it not acceptable/desireable to add Derby-specific public methods to a class like EmbedDatabaseMetadata.java, which is an implementation of java.sql.DatabaseMetaData? Technically speaking, I don't think the addition of public classes breaks the JDBC standard (so long as we aren't telling people that they can import EmbedDatabaseMetadata in their apps--which we aren't), but I'm curious as to whether there's a "good programming practice" here that the Derby community would like to (or already does?) hold to? [ #1 **** End **** ] Option II: Add new SQL statements, such as "getColumnsForODBC", to the existing metadata.properties file, as described in the proposal for DERBY-107. Then we could extend the EDM class with a new, simple class that sets ODBC-related state, and modify EDM to check the state and execute the appropriate statements. For example, we could add a protected variable "forODBC" to EDM, default it to "false", and then set it to true in the extended class for ODBC. EDM would then check the flag and execute the corresponding metadata statement. The presumption here is that SystemProcedures would check for the ODBC indicator and, if found, use an instance of the new subclass for the metadata calls, instead of using an instance of the existing EDM. This approach allows us to avoid adding new (non-JDBC) public classes to EDM, at the cost of creating another (albeit fairly simple) metadata class. With this approach, we could even go further and add another file, say "odbc_metadata.properties" that holds the ODBC metadata statements (instead of adding them to the existing metadata.properties file). The new subclass could then load _that_ file instead of the current metadata.properties file, which gives us a nice separation of functionality: all of the ODBC code cleanly separated from the JDBC code. Of course, that could be a bad thing, too, since 1) we'd then have TWO metadata files to worry about in the codeline, instead of just one, which introduces room for error if/when metadata-related processing changes occur in Derby, and 2) we'd have to duplicate any SQL statements that are the same for ODBC and JDBC (ex. several of the "getBestRowIdentifier" queries) in both files. So I'm guessing we wouldn't want to create another metadata file...but I thought I'd bring it up, just in case. Option III: Create some kind of internal VTI for ODBC metadata and use that. I have to admit that I don't know too much about how VTIs work, but Kathey gave me some places to look, so I'm going to read up. Apparently, we can execute the same metadata SQL statements that already exist for JDBC, then use a VTI to "massage" the result set into something that complies with ODBC specifications. This might be a good choice given that most of the differences between ODBC and JDBC are in the types of the columns returned. For example, JDBC might say that any String will do, whereas ODBC will say it has to be VARCHAR. In that case, a literal value ' ' will be fine for JDBC, but since it's treated as a CHAR value by Derby, it would be breaking ODBC standard. With a VTI, we could theoretically accomplish the same things that we'd be doing with new SQL statements--such as casting ' ' to VARCHAR in this particular case. Other modifications we'd have to implement include casting certain integer columns to smallints, and replacing null values in JDBC (such as for "sql_data_type" and "buffer_length" columns) to legal values (neither column is supposed to be null for ODBC). Upside to this is that we still only have a single metadata.properties file, which (theoretically) makes maintenance of metadata procedures easier. As I don't know much about VTIs, I can't say what else this approach would require, but it seems safe to say that it would at least require another class to serve as the ODBC VTI. How that would tie into the SystemProcedures and EmbedDatabaseMetadata classes, I don't know yet... So... [ #2 **** COMMUNITY INPUT? **** ] What are people's feelings/reactions/comments in terms of which of the above three options is most appropriate for the Derby codeline? In a nutshell, it comes to 1) new public methods on EmbedDatabaseMetadata, 2) new subclass to EmbedDatabaseMetadata that sets ODBC-related state (with or without creating a new odbc_metadata.properties file), and 3) create an ODBC VTI to manipulate existing JDBC result sets into ODBC-compliant ones. NOTE: another thing we have to decide is if we're going to support different levels of ODBC metadata--for example, ODBC 1.0 metadata differs from ODBC 2.0 in some respects, and likewise for ODBC 3.0. Do we want to support all of the different levels? If so, which of the above approaches is going to make that most feasible? My guess is that option III would be the best choice in that case... [ #2 **** End **** ] The final part of Kathey's and my discussion involved the issue of Derby upgrade. While it IS true that, as Dan said, "no upgrade code has been added to support upgrading databases from 10.0 to 10.1" (see http://nagoya.apache.org/eyebrowse/ReadMsg?listName=derby-dev@db.apache.org&msgNo=1703), it nonetheless seems like a good idea to consider what effects on upgrade the changes to Derby metadata might have. [ #3 **** COMMUNITY INPUT? **** ] Does anyone know anything about the upgrade implications for any of the above-mentioned approaches? Are there caveats to be considered, or is there any advice that might save us headaches in the future when support for upgrades is developed? For example, with Cloudscape upgrade/downgrade, did Cloudscape drop and then recreate the system stored prepared statements? And if so, what are the odds that Derby will do the same? Would that be automatic, or would we have to add logic for each of the prepared statements in question? Apologies in advance for my lack of knowledge in that area... [ #3 **** End **** ] ----------------- Thanks to everyone who took the time to read all or part of this email. Feeback is greatly appreciated... Army