db-derby-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Apache Wiki <wikidi...@apache.org>
Subject [Db-derby Wiki] Update of "MysqlDerbyMigration/DesignDocument" by RaminMoazeni
Date Thu, 22 Jun 2006 02:49:12 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Db-derby Wiki" for change notification.

The following page has been changed by RaminMoazeni:
http://wiki.apache.org/db-derby/MysqlDerbyMigration/DesignDocument

New page:
= MySQL to Apache Derby Migration Tool =

== Ramin Moazeni ==

== Version 1.1 ==

  
= Introduction =

When you work with databases, you will often need to copy data from one database to another.
The Apache Derby database currently is lacking a migration tool that that enables you to quickly
migrate popular databases and data to Apache Derby database. Using a Graphical User Interface
(GUI), the Derby migration tool will assist in successfully migrating databases. The use of
Derby migration tool will assist in migration of databases including schema objects such as
tables, indexes, views, and stored procedures to Apache Derby database. The advantage of using
the migration tool will be Cost Saving, improved productivity, and reduced risks. 

I propose to implement the migration tool that will support migration of tables, views, indexes,
and stored procedures. The steps required to migrate from MySQL to Apache Derby generally
involves: 

 * Migration of the schema and data 
 * Migration of database objects such as tables, views, indexes, and stored procedures
 
This paper serves as a proposal for implementing the MySQL to Derby migration tool. The implementation
language for this project is Java since it can be run on different platforms.  

= The Migration Process =
The MySQL to Apache Derby Migration tool is a framework that enables migration and conversion
of data from MySQL database to Apache Derby database. The system consists of several components
responsible for capturing source database schema, database object mapping, datatype mapping
and capturing source data to migrate a MySQL database to Apache Derby. 

This following diagram describes processes used by the system to implement its capabilities.
It explains how the system interacts with its actors (i.e users and databases) and the specific
steps and interactions will the users have with the system. 

http://www-scf.usc.edu/~moazeni/GSoC/mysql_7.jpg

The above processes, tasks and interactions are summarized as the following use cases:

 * UC-1: Select Source DBMS    
   The user will select the source DBMS (i.e mysql, oracle, etc)

 * UC-2: Enter Source DBMS Connection Parameters 
   To enter information of the source Database Management System such as database IP, Port
   information, username and password. 

 * UC-3: Enter Target DBMS Connection Parameters 
   To enter information of the target Database Management System such as database IP, 
   Port information, username and password. 

 * UC-4: Select DB to be Migrated 
   The migration tool will capture the databases of the source DBMS. The user can select
   the database that needs to be migrated. 

 * UC-5: Choose DB Objects to be Migrated 
   The migration tool will capture the objects (tables, views, stored procedures, etc)
   from the selected database. The user can then select the objects that need to be
   migrated. 

 * UC-6: Edit mapping errors 
   The user will be displayed a list of object mapping errors. The user will get the
   option of manually editing the sql statements to resolve the issue.

 * UC-7: Database Object Mapping 
   Upon capturing the database schema, the system performs object mapping based on the
   target database. 

 * UC-8: Capture Source Database Data 
   The system will capture data from source database. 

 * UC-9: Data Conversion 
   The system will convert and map data from the source database to the target database by
   transforming SQL statements. 

 * UC-10: Data Migration 
   The system will capture and migrate data from the source database to the target
   database. Based on the list of tables captured from the source database (UC-14), the
   migration tool captures the data in the tables, and re-creates them in the target 
   database. 

 * UC-11: Error Handling 
   To handle any errors during the migration process such as errors in connecting to the
   source or target DBMS, errors in connecting to the database account and  mapping errors
   which includes, object mapping errors, lack of support for specific object and data
   type mapping errors. 

 * UC-12: Summary Report 
   The system will create a summary report after migration is complete. The summary report
   will include details about the objects that were migrated, and successful or
   unsuccessful migration results. 

 * UC-13: Establish Connection 
   The system will establish connection with the Source or Target DBMS. 

 * UC-14: Capture Source Database Schema 
   The system will capture schema from the source database. This will include listing of
   tables, views, indexes and stored procedures. 

 * UC-15: Schema Migration 
   Having the list of database objects in hand (UC-14), the migration tool will then  
   captures the structure of each object, provide necessary mappings(UC-7) based on the
   target database and then re-create the database schema on the target database.  

The Migration tool will be designed to migrate the following objects:

 * Tables 
 * Views 
 * Indexes 
 * Stored Procedures 

The Migration tool will not handle migration of the following objects:

 * Functions 
 * Triggers 

= High Level Design =
The system will be implemented in Java language using Java classes and interfaces. The following
class diagram exhibits the relationship among classes and interfaces of the MySQL to Derby
Migration tool: 

http://www-scf.usc.edu/~moazeni/GSoC/mysql_2.gif

== Approach 1: Using DatabaseMetaData ==

=== Schema Migration ===

A schema is a collection of components and database objects under the control of a given database
user. The task for schema migration is to capture the schema and its components and database
objects and then proceed with migration. 

To migrate a database you must obtain the catalog information about the database as well as
list of the tables in the database and a list of all columns within each table. That information
lets you create the SQL statements necessary to migrate the data. To do this, Java Database
Connectivity (JDBC) can be used. Most databases provide a system table that will give you
a list of the tables. However because the name of the system table varies from one database
to another, the most compatible way to list and capture information about database objects
is to use the Java class DatabaseMetaData. This class provides information about the database,
such as the names of the tables and column information, names of views, procedure information,
etc. 

The DatabaseMetaData class has many properties and methods. For example, for obtaining the
table information, the getTables() method can be used to return the list of tables in ResultSet
format. After obtaining the list of tables, a list of the columns in that table should be
obtained. Doing that lets you create a proper CREATE TABLE statement, as well as INSERT and
SELECT statements to match the current table. The process of obtaining a list of columns is
through the use of getColumns() method.  After capturing the database catalog information
(tables and columns), the required SQL statements can be generated. The same process can be
used for capturing and recreating views, indexes, and procedures except that a different method
of DatabaseMetaData may be used to capture the information. 

Any data type mappings needs to be done when re-creating the SQL statements as per 
table 1. For example the following SQL statement from MySQL:
  
    || CREATE TABLE EXAMPLE ( id       TINYINT); ||
    
    Needs to be changed to
    
    || CREATE TABLE EXAMPLE ( id       SMALLINT); ||

Considering the syntax of creating a view, we can see that having a query is necessary to
create a view. Therefore, in addition to data types that needs to be mapped, join statements
also need to be considered when creating views. The problem is that some MySQL join syntax
are not supported by Apache Derby. The following are examples of join operations and their
compatible syntax in Apache Derby: 
 
CROSS JOIN 
|| Derby || SELECT * FROM tab1 CROSS JOIN tab2 ||
|| MySQL || SELECT * FROM tab1, tab2 ||

INNER JOIN
|| Derby || SELECT * FROM tab1 INNER JOIN tab2 USING(id) ||
|| MySQL || SELECT * FROM tab1 INNER JOIN tab2 ON tab1.id=tab2.id ||

LEFT OUTER JOIN 
|| Derby || SELECT * FROM tab1 LEFT OUTER JOIN tab2 USING (id) ||
|| MySQL || SELECT * FROM tab1 LEFT OUTER JOIN tab2 ON tab1.id = tab2.id ||
 
RIGHT OUTER JOIN 
|| Derby || SELECT * FROM tab1 RIGHT OUTER JOIN tab2 USING (id) ||
|| MySQL || SELECT * FROM tab1 RIGHT OUTER JOIN tab2 ON tab1.id = tab2.id ||
 
In addition to join syntax, the reference manuals of both MySQL and Derby needs to be consulted
for proper syntax during migration to Apache Derby. 

=== Data Migration ===

In this section, the process of data migration from MySQL to Apache Derby database will be
discussed. However, before doing this, there is a need to find out the differences between
the MySQL and Apache Derby structure. The table below shows the compatibility between built
in data types. Some data types have a one to one correspondence while others require more
consideration before migrating. The tables include information on the following: 

 * Numeric Types 
 * Date and Time Types 
 * String Types 

===== Numeric Types: =====

http://www-scf.usc.edu/~moazeni/GSoC/numeric.jpg

===== Date and Time Types: =====

http://www-scf.usc.edu/~moazeni/GSoC/datetime.jpg

===== String Types: =====

http://www-scf.usc.edu/~moazeni/GSoC/string.jpg

Table 1: Data type Mapping from MySQL to Apache Derby


To manipulate and migrate the data, Apache Derby supports INSERT, UPDATE and DELETE statements
with similar syntax and behavior as MySQL. These statements can be used programmatically to
migrate the data  

This step is usually done after schema migration most importantly when tables are migrated.
To do that, both SELECT and INSERT statements must be created. The SELECT statement will read
the table data from the source database. The INSERT statement will write the data to the target
database. 

The task involves retrieving columns information from source database and re-creates the SELECT
and INSERT statements. The SELECT statement can then be executed and a list of records will
be returned. Using the returned list of records, the INSERT statement can be created by inserting
the individual column data and then executing the INSERT statement. Any data type mappings
needs to be done when re-creating the INSERT statement as per the above tables. 

== Approach 2: Using DdlUtils ==
 
Another approach would be to use DdlUtils. DdlUtils is a component for working with Database
Definition (DDL) files. DDL files are XML files that contain the definition of a database
schema, e.g. tables and columns. These files can also be fed into DdlUtils in order to create
the corresponding database. An example of such a DDL file is: 

{{{ <?xml version="1.0"?>
<!DOCTYPE database SYSTEM "http://db.apache.org/torque/dtd/database.dtd">
  <table name="book">
    <column name="book_id"
            type="INTEGER"
            required="true"
            primaryKey="true"
            autoIncrement="true"/>
    <column name="isbn"
            type="VARCHAR"
            size="15"
            required="true"/>
    <column name="author_id"
            type="INTEGER"
            required="true"/>
    <column name="title"
            type="VARCHAR"
            size="255"
            defaultValue="N/A"
            required="true"/>
    <foreign-key foreignTable="author">
      <reference local="author_id" foreign="author_id"/>
    </foreign-key>  
    <index name="book_isbn">
      <index-column name="isbn"/>
    </index>
  </table>
</database> }}}


In order to use DdlUtils component to migrate the source database to the target database,
following steps are suggested: 

 * Capture DDL file from DdlUtils. 
 * Modify the source DDL to the target DDL using Serialization and Deserialization of XML.

 * Fed in the new DDL file to the DdlUtils. 

In order to easily work with XML files, the use of Serialization and Deserialization of XML
is suggested. XML files can be deserialized to Java objects. Therefore, we are dealing with
objects afterwards. After modifying the value of object attributes based on datatype mappings
of table 1, objects can be serialized to XML files and be used by the DdlUtils to create the
corresponding database.

Please note that the DdlUtils doesn’t support non-table entities, constraints, triggers,
stored procedures, etc. A possible solution would be to execute database-specific commands
to capture required information.

= Initial Prototypes =

The migration tool will provide both graphical user interface as well as command line interface.
I propose the following way of performing activities using the Graphical User Interface that
the system provides:

===== Welcome Screen: =====
http://www-scf.usc.edu/~moazeni/GSoC/mysql_1.jpg

===== Selecting Source and Target Database: =====
http://www-scf.usc.edu/~moazeni/GSoC/mysql_2.jpg

===== Source Database Schema Selection =====
http://www-scf.usc.edu/~moazeni/GSoC/mysql_3.jpg

===== Select Objects to be Migrated: =====
http://www-scf.usc.edu/~moazeni/GSoC/mysql_4.jpg

===== Object Mapping Errors =====
http://www-scf.usc.edu/~moazeni/GSoC/mysql_5.jpg

===== Migration Complete =====
http://www-scf.usc.edu/~moazeni/GSoC/mysql_8.jpg

Mime
View raw message