Subject: Re: Google Summer of Code
From: Ян Программист (Yan Programmist) <webautomator@gmail.com>
To: Bryan Pendleton, derby-dev@db.apache.org, Rick Hillegas
Date: Wed, 17 Mar 2010 15:15:30 +0200

Oh yeah, the XML type in Oracle is for ORM stuff, that's for sure ;)

John

2010/3/17 Ян Программист <webautomator@gmail.com>:

> 2010/3/17 Ян Программист <webautomator@gmail.com>:
>
>> * No actual compatibility tests for Oracle, SQL Server
>>
>> I imagine that there are some existing compatibility test suites out
>> there. It would be interesting to understand Derby's compatibility
>> with other databases. This sounds like a task on which a GSoC student
>> could make good progress.
>>
>> Thanks,
>> -Rick
>
> Well, some useful compatibility features to focus on:
>
>   1. Converting data from one database to another while preserving data
>      consistency.
>      Why special software for converting between different databases is
>      useless: the actual length of the data in table cells differs
>      between engines, so one has to decide whether it is acceptable to
>      create and populate additional tables with the same column
>      structure (read: table schema) for entries whose sizes are
>      incompatible with the column types of the consuming database's
>      tables.
>   2. Converting data in an entity-centric manner: convert data of
>      similar column types from the point of view of the stored data. If
>      the database client code contains strategy-switching logic for how
>      to interpret the state of the data described in the database, it
>      is possible to populate entries in other tables of the consuming
>      database without losing data consistency.
>   3. Converting types is not always good. Compatibility can instead
>      mean that the consuming database reads entries through a special
>      formatter which interprets the data as its own type. You can then
>      copy the data as a binary heap and force the consuming database to
>      enable the special formatter even in a production environment (the
>      formatter would point at specific offsets inside each cell's data
>      chunk for every number/word/... actually represented, avoiding
>      useless binary delimiters). But this only works for data types
>      that represent the same class (numeric-to-numeric, date-to-date).
>      Stored cell data could later be converted to the native format, so
>      that the normal formatter replaces the special one once the
>      consistency of the entity state (ORM?) represented by the entries
>      is ensured.
>   4. Column-type compatibility is not index compatibility. A source
>      database contains lots of indexes to speed up JOINs, so the
>      consuming database normally has to re-index the consumed entries.
>      An index essentially records the positions of cells, and in
>      different databases the same data ends up at different positions
>      because of cell data sizes and binary delimiters (part of the
>      storage-engine internals). The delimiters are the same for a given
>      storage engine, but cell data sizes vary from cell to cell. I
>      still do not know how to solve this, but it would be awesome to
>      apply, for example, MySQL indexes to JavaDB without re-indexing on
>      the JavaDB side.
>   5. I see VARBINARY as a candidate for dealing with aggregation in
>      ORM. When you need a conversion for whatever reason, you can fetch
>      the entity data described by the ORM classes (where aggregated
>      entities are virtualised through referencing columns on either
>      table), take the entries of the aggregated table (the property set
>      of an entity) and store them in a VARBINARY column of the
>      consuming database. Because it is known how the ORM class entries
>      will be iterated over in the consumer database's tables, the
>      consumer side can read them back with a binary formatter and use
>      the data however it wants. Converting the consumer database's
>      VARBINARY into a set of columns native to the aggregated entity
>      entries should be fast enough(?).
>   6. All the numeric-precision handling. Yes, the ODBC NUMERIC type
>      would make no sense once column compatibility is implemented
>      (automatic precision handling when reading cells of different
>      lengths (digit counts), float-to-int conversion, and so on). This
>      can help avoid some ODBC code, and likewise JDBC code, related to
>      column-type handling when accessing data from non-native database
>      clients (accessing JavaDB from C#, or SQLite from Java, for
>      example).
>
> *XML persistence driven conversions* of both schema and data and *"hot"
> binary copying* of non-native data from different databases (the
> special formatters in (1)) are the two main goals here.
> John
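To make the type-conversion side of item 1 a bit more concrete, a minimal sketch of the per-pair column-type map such a tool might start from is below. The class name and the mapping rules are my own assumptions for illustration, not an existing Derby or MySQL API; a real tool would also have to consider lengths, precision, and charset.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a per-pair column-type map a conversion tool
// might start from when copying a MySQL schema into Derby/JavaDB.
public class TypeMap {
    private static final Map<String, String> MYSQL_TO_DERBY = new HashMap<>();
    static {
        MYSQL_TO_DERBY.put("TINYINT",  "SMALLINT");  // Derby has no 1-byte integer type
        MYSQL_TO_DERBY.put("DATETIME", "TIMESTAMP");
        MYSQL_TO_DERBY.put("TEXT",     "CLOB");
    }

    // Falls back to the source type name when no special rule applies.
    public static String toDerby(String mysqlType) {
        return MYSQL_TO_DERBY.getOrDefault(mysqlType, mysqlType);
    }

    public static void main(String[] args) {
        System.out.println("TINYINT -> " + toDerby("TINYINT"));  // SMALLINT
        System.out.println("VARCHAR -> " + toDerby("VARCHAR"));  // VARCHAR
    }
}
```

A static lookup table like this is only the trivial case; the hard part described in item 1 (incompatible cell sizes) starts where a simple name-to-name mapping ends.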
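Item 5's idea of parking an aggregated property set in a VARBINARY cell could be sketched roughly as follows. The length-prefixed layout and the field names are hypothetical, chosen only to show the "binary formatter" round trip; no actual ORM or Derby API is used.

```java
import java.io.*;
import java.util.LinkedHashMap;
import java.util.Map;

// Hedged sketch of item 5: pack an entity's aggregated property set into
// one byte[] (what would land in a VARBINARY cell of the consuming
// database) and read it back with the matching "binary formatter".
public class VarbinaryPack {
    // Length-prefixed layout: entry count, then (UTF key, long value) pairs.
    // Explicit lengths mean no binary delimiters are needed.
    public static byte[] pack(Map<String, Long> props) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(props.size());
            for (Map.Entry<String, Long> e : props.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue());
            }
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e); // cannot happen on in-memory streams
        }
    }

    public static Map<String, Long> unpack(byte[] cell) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(cell));
            Map<String, Long> props = new LinkedHashMap<>();
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                props.put(in.readUTF(), in.readLong());
            }
            return props;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, Long> props = new LinkedHashMap<>();
        props.put("orderId", 42L);
        props.put("qty", 7L);
        byte[] cell = pack(props);        // what would go into the VARBINARY column
        System.out.println(unpack(cell)); // prints {orderId=42, qty=7}
    }
}
```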
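The precision "auto-dealing" in item 6 might, assuming the target column is DECIMAL(precision, scale), look something like the sketch below using java.math.BigDecimal. The fit rule (round the fraction, reject overflow of the integer part) is illustrative, not how any particular driver actually behaves.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Sketch of item 6: coerce a value read from one database into the
// precision/scale of the target column before any JDBC/ODBC type juggling.
public class PrecisionFit {
    // Rounds the fractional part to the target scale; throws when even
    // after rounding the value has more significant digits than allowed.
    public static BigDecimal fit(BigDecimal v, int precision, int scale) {
        BigDecimal scaled = v.setScale(scale, RoundingMode.HALF_UP);
        if (scaled.precision() > precision) {
            throw new ArithmeticException("value does not fit DECIMAL("
                    + precision + "," + scale + "): " + v);
        }
        return scaled;
    }

    public static void main(String[] args) {
        // e.g. a MySQL DOUBLE landing in a Derby DECIMAL(7,2) column
        System.out.println(fit(new BigDecimal("12345.678"), 7, 2)); // 12345.68
    }
}
```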