db-derby-dev mailing list archives

From: Rick Hillegas <Richard.Hille...@Sun.COM>
Subject: Re: Google Summer of Code
Date: Thu, 18 Mar 2010 19:20:14 GMT
Hi John,

I agree with Bryan that we want to compare Derby's behavior to that of 
other databases so that we can understand

1) where Derby diverges from the SQL Standard

2) what features we should add to Derby to increase its usefulness

As you note, datatypes are an important compatibility topic. I think 
that you are on the right track when you talk about ORM frameworks. I 
think that we would understand a great deal about datatype compatibility 
by studying how the major ORM frameworks map datatypes onto various 
rdbmses. The engineers at Hibernate, Toplink, etc. have done a lot of 
the analysis already. A systematic compilation of their findings would 
be useful.
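As a rough illustration (just a sketch; the Derby in-memory URL below is only an example, and the driver jar has to be on the classpath), a small JDBC program can dump the native type name and precision that each database reports through DatabaseMetaData.getTypeInfo(), so the output for Derby, MySQL, Oracle, etc. can be compared side by side:

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;

    // Prints the native type name a database advertises for each JDBC type
    // code (java.sql.Types), plus its maximum precision. Running this against
    // several databases and diffing the output gives a first cut at a
    // datatype compatibility matrix.
    public class TypeInfoDump {
        public static void main(String[] args) throws Exception {
            // Example URL only -- point it at whichever database you want to
            // survey.
            String url = args.length > 0 ? args[0]
                    : "jdbc:derby:memory:typeInfoDb;create=true";
            try (Connection conn = DriverManager.getConnection(url)) {
                DatabaseMetaData md = conn.getMetaData();
                System.out.println("-- " + md.getDatabaseProductName()
                        + " " + md.getDatabaseProductVersion());
                ResultSet rs = md.getTypeInfo();
                while (rs.next()) {
                    System.out.printf("%4d  %-25s precision=%d%n",
                            rs.getInt("DATA_TYPE"),      // java.sql.Types code
                            rs.getString("TYPE_NAME"),   // native type name
                            rs.getInt("PRECISION"));
                }
                rs.close();
            }
        }
    }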


Ян Программист wrote:
> 2010/3/17 Ян Программист <webautomator@gmail.com>:
> >>  * No actual compatibility tests for Oracle, SQL Server
> >
> > I imagine that there are some existing compatibility test suites out
> > there. It would be interesting to understand Derby's compatibility
> > with other databases. This sounds like a task on which a GSoC
> > student could make good progress.
> >
> > Thanks,
> > -Rick
> >
> Well, some useful compatibility features to focus on:
>    1. Converting data from one database to another while preserving
>       data consistency. This covers the reasons why special software
>       for converting between different databases ends up useless, the
>       problem of differing actual lengths of data in table cells, and
>       whether it is acceptable to create and populate additional
>       tables with the same column structure (read: table schema) for
>       rows whose values do not fit the column types of the consuming
>       tables in the consuming database.
>    2. Converting data in an entity-centric manner: convert data of
>       similar column types based on the stored data itself. If the
>       database client code contains strategy-switching logic for how
>       to interpret the state of the data described in the database,
>       it is possible to populate rows in other tables of the
>       consuming database without losing data consistency.
>    3. Converting types is not always good. Compatibility can instead
>       mean that the consuming database reads rows through a special
>       formatter that interprets the data as its own type. You can
>       then copy the data as a binary heap and force the consuming
>       database to enable the special formatter to read it, even in a
>       production environment (the formatter would point at specific
>       offsets within each cell's data chunk for every number, word,
>       etc. of the actually represented data, to avoid dealing with
>       useless binary delimiters). But this only applies to data types
>       that represent the same class (numeric-to-numeric,
>       date-to-date). Stored cell data could later be converted to the
>       native format so the special formatter can be dropped in favor
>       of the normal one, once the consistency of the state of the
>       entities (ORM?) represented in the rows is ensured.
>    4. Column type compatibility is not index compatibility. The
>       source database contains lots of indexes to speed up JOINs, so
>       the consuming database is normally required to re-index the
>       consumed rows. An index simply records positions of cells, and
>       in different databases the same data ends up at different
>       positions because of cell data sizes and binary delimiters
>       (storage engine internals). Delimiters are the same for a given
>       storage engine, but cell data sizes vary from cell to cell. I
>       still do not know how to solve this, but it would be awesome to
>       apply, for example, MySQL indexes to JavaDB without re-indexing
>       on the JavaDB side.
>    5. I see VARBINARY as a candidate for dealing with aggregation in
>       ORM. When a conversion is needed for whatever reason, you can
>       fetch the entity data described by the ORM classes (where
>       aggregated entities are virtualized through referencing columns
>       on either table), take the rows from the aggregated table (the
>       property set of the entity), and store them into a VARBINARY
>       column of the consuming database. Because it is known how the
>       ORM class rows will be iterated in the consuming database's
>       tables, you can read them on the consumer side with a binary
>       formatter and use the data however you want. Converting the
>       consumer database's VARBINARY back into a set of native columns
>       for the aggregated entity rows should be fast enough(?)
>    6. All the numeric precision handling. The ODBC NUMERIC type would
>       not make much sense any more if column compatibility were
>       implemented (automatic precision handling when reading cells of
>       different lengths (digit counts), float-to-int conversion, and
>       so on). This could help avoid some ODBC code, as well as JDBC
>       code, related to column type handling when accessing data from
>       non-native database clients (accessing JavaDB from C# or SQLite
>       from Java, for example).
> *XML persistence driven conversions* of both schema and data, and
> *"hot" binary copying* of non-native data from different databases
> (the special formatters in (1)), are the two main goals here. John
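Regarding points (1) and (6) in John's list, here is one possible sketch (not Derby-specific; the URLs, table and column names are made up) of a JDBC-level copy that explicitly coerces DECIMAL values to the scale of the consuming column rather than relying on the driver's implicit conversion:

    import java.math.BigDecimal;
    import java.math.RoundingMode;
    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.ResultSetMetaData;
    import java.sql.Statement;

    // Copies a numeric column from a source database into a consuming one,
    // rounding values to the target column's declared scale instead of
    // letting the insert fail or truncate silently. Table/column names are
    // examples only.
    public class DecimalCopy {
        public static void main(String[] args) throws Exception {
            try (Connection src = DriverManager.getConnection(
                         "jdbc:mysql://localhost/srcdb", "user", "pw");
                 Connection dst = DriverManager.getConnection(
                         "jdbc:derby:destdb;create=true")) {

                // Scale the consuming column was declared with, e.g. DECIMAL(12,2).
                final int targetScale = 2;

                try (Statement st = src.createStatement();
                     ResultSet rs = st.executeQuery("SELECT id, amount FROM payments");
                     PreparedStatement ins = dst.prepareStatement(
                             "INSERT INTO payments (id, amount) VALUES (?, ?)")) {

                    // Report what the source actually declares for the column.
                    ResultSetMetaData md = rs.getMetaData();
                    System.out.println("source scale = " + md.getScale(2)
                            + ", precision = " + md.getPrecision(2));

                    while (rs.next()) {
                        BigDecimal amount = rs.getBigDecimal("amount");
                        if (amount != null) {
                            // Round explicitly to the consuming column's scale.
                            amount = amount.setScale(targetScale, RoundingMode.HALF_UP);
                        }
                        ins.setLong(1, rs.getLong("id"));
                        ins.setBigDecimal(2, amount);
                        ins.executeUpdate();
                    }
                }
            }
        }
    }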
