Subject: Re: Google Summer of Code
From: Ян Программист (Yan Programmist) <webautomator@gmail.com>
To: Bryan Pendleton, derby-dev@db.apache.org, Rick Hillegas
Date: Wed, 17 Mar 2010 15:15:30 +0200

Oh yeah, the XML type in Oracle is for ORM stuff, that's for sure ;)

John

2010/3/17 Ян Программист <webautomator@gmail.com>:

> 2010/3/17 Ян Программист <webautomator@gmail.com>:
>
>> * No actual compatibility tests for Oracle, SQL Server
>>
>> I imagine that there are some existing compatibility test suites out
>> there. It would be interesting to understand Derby's compatibility
>> with other databases. This sounds like a task on which a GSoC student
>> could make good progress.
>>
>> Thanks,
>> -Rick
>
> Well, some useful compatibility features to focus on:
>
>   1. Converting data from one database to another while preserving data
>      consistency.
>      Why special software for converting between different databases is
>      useless: the actual length of the data in table cells differs
>      between engines, so one has to decide whether it is acceptable to
>      create and populate additional tables with the same column
>      structure (read: table schema) for entries whose sizes are
>      incompatible with the column types of the consuming database's
>      tables.
>   2. Converting data in an entity-centric manner: convert data of
>      similar column types from the point of view of the stored data. If
>      the database client code contains strategy-switching logic for how
>      to interpret the state of the data described in the database, it
>      is possible to populate entries in other tables of the consuming
>      database without losing data consistency.
>   3. Converting types is not always good. Compatibility can instead
>      mean that the consuming database reads entries through a special
>      formatter which interprets the data as its own type. You can then
>      copy the data as a binary heap and force the consuming database to
>      enable the special formatter even in a production environment (the
>      formatter would point at specific offsets inside each cell's data
>      chunk for every number/word/... actually represented, avoiding
>      useless binary delimiters). But this only works for data types
>      that represent the same class (numeric-to-numeric, date-to-date).
>      Stored cell data could later be converted to the native format, so
>      that the normal formatter replaces the special one once the
>      consistency of the entity state (ORM?) represented by the entries
>      is ensured.
>   4. Column-type compatibility is not index compatibility. A source
>      database contains lots of indexes to speed up JOINs, so the
>      consuming database normally has to re-index the consumed entries.
>      An index essentially records the positions of cells, and in
>      different databases the same data ends up at different positions
>      because of cell data sizes and binary delimiters (part of the
>      storage-engine internals). The delimiters are the same for a given
>      storage engine, but cell data sizes vary from cell to cell. I
>      still do not know how to solve this, but it would be awesome to
>      apply, for example, MySQL indexes to JavaDB without re-indexing on
>      the JavaDB side.
>   5. I see VARBINARY as a candidate for dealing with aggregation in
>      ORM. When you need a conversion for whatever reason, you can fetch
>      the entity data described by the ORM classes (where aggregated
>      entities are virtualised through referencing columns on either
>      table), take the entries of the aggregated table (the property set
>      of an entity) and store them in a VARBINARY column of the
>      consuming database. Because it is known how the ORM class entries
>      will be iterated over in the consumer database's tables, the
>      consumer side can read them back with a binary formatter and use
>      the data however it wants. Converting the consumer database's
>      VARBINARY into a set of columns native to the aggregated entity
>      entries should be fast enough(?).
>   6. All the numeric-precision handling. Yes, the ODBC NUMERIC type
>      would make no sense once column compatibility is implemented
>      (automatic precision handling when reading cells of different
>      lengths (digit counts), float-to-int conversion, and so on). This
>      can help avoid some ODBC code, and likewise JDBC code, related to
>      column-type handling when accessing data from non-native database
>      clients (accessing JavaDB from C#, or SQLite from Java, for
>      example).
>
> *XML persistence driven conversions* of both schema and data and *"hot"
> binary copying* of non-native data from different databases (the
> special formatters in (1)) are the two main goals here.
> John
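To make the type-conversion side of item 1 a bit more concrete, a minimal sketch of the per-pair column-type map such a tool might start from is below. The class name and the mapping rules are my own assumptions for illustration, not an existing Derby or MySQL API; a real tool would also have to consider lengths, precision, and charset.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a per-pair column-type map a conversion tool
// might start from when copying a MySQL schema into Derby/JavaDB.
public class TypeMap {
    private static final Map<String, String> MYSQL_TO_DERBY = new HashMap<>();
    static {
        MYSQL_TO_DERBY.put("TINYINT",  "SMALLINT");  // Derby has no 1-byte integer type
        MYSQL_TO_DERBY.put("DATETIME", "TIMESTAMP");
        MYSQL_TO_DERBY.put("TEXT",     "CLOB");
    }

    // Falls back to the source type name when no special rule applies.
    public static String toDerby(String mysqlType) {
        return MYSQL_TO_DERBY.getOrDefault(mysqlType, mysqlType);
    }

    public static void main(String[] args) {
        System.out.println("TINYINT -> " + toDerby("TINYINT"));  // SMALLINT
        System.out.println("VARCHAR -> " + toDerby("VARCHAR"));  // VARCHAR
    }
}
```

A static lookup table like this is only the trivial case; the hard part described in item 1 (incompatible cell sizes) starts where a simple name-to-name mapping ends.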
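Item 5's idea of parking an aggregated property set in a VARBINARY cell could be sketched roughly as follows. The length-prefixed layout and the field names are hypothetical, chosen only to show the "binary formatter" round trip; no actual ORM or Derby API is used.

```java
import java.io.*;
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of item 5: pack an entity's aggregated property set into
// one byte[] (what would land in a VARBINARY cell of the consuming
// database) and read it back with the matching "binary formatter".
public class VarbinaryPack {
    // Length-prefixed layout: entry count, then (UTF key, long value) pairs.
    // Explicit lengths mean no binary delimiters are needed.
    public static byte[] pack(Map<String, Long> props) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(props.size());
            for (Map.Entry<String, Long> e : props.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue());
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen on in-memory streams
        }
    }

    public static Map<String, Long> unpack(byte[] cell) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(cell));
            Map<String, Long> props = new LinkedHashMap<>();
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                props.put(in.readUTF(), in.readLong());
            }
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> props = new LinkedHashMap<>();
        props.put("orderId", 42L);
        props.put("qty", 7L);
        byte[] cell = pack(props);        // what would go into the VARBINARY column
        System.out.println(unpack(cell)); // prints {orderId=42, qty=7}
    }
}
```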
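The precision "auto-dealing" in item 6 might, assuming the target column is DECIMAL(precision, scale), look something like the sketch below using java.math.BigDecimal. The fit rule (round the fraction, reject overflow of the integer part) is illustrative, not how any particular driver actually behaves.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch of item 6: coerce a value read from one database into the
// precision/scale of the target column before any JDBC/ODBC type juggling.
public class PrecisionFit {
    // Rounds the fractional part to the target scale; throws when even
    // after rounding the value has more significant digits than allowed.
    public static BigDecimal fit(BigDecimal v, int precision, int scale) {
        BigDecimal scaled = v.setScale(scale, RoundingMode.HALF_UP);
        if (scaled.precision() > precision) {
            throw new ArithmeticException("value does not fit DECIMAL("
                    + precision + "," + scale + "): " + v);
        }
        return scaled;
    }

    public static void main(String[] args) {
        // e.g. a MySQL DOUBLE landing in a Derby DECIMAL(7,2) column
        System.out.println(fit(new BigDecimal("12345.678"), 7, 2)); // 12345.68
    }
}
```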