polygene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stanislav Muhametsin <stanislav.muhamet...@zest.mail.kapsi.fi>
Subject Re: JPA?
Date Wed, 21 Sep 2016 09:00:02 GMT
On 21.9.2016 0:08, Jiri Jetmar wrote:
> Independently of that that, things starts to be complicated in the SQL
> world with large data when you are submitting e.g. a inner JOIN statement
> in a transactional INSERT expression, where the tables are located  (the
> data) on different nodes, simply because of a single node limit.

Well... there are ways to circumvent this, too. :)
Unfortunately, can't tell much more about that, as it is key concept of 
the company where I work right now.
But very large and globally distributed RDBMS nodes, which don't lag 
under high stress, are very much so possible!

> I would like to work on this task. Therefore smart ideas are highly welcome
> ! :-)

I am responsible for current SQL indexing code, and Paul is responsible 
for current SQL entitystore code.
The indexing code is, unfortunately, quite spaghettified, and I would do 
lots of things probably very differently now, than how I did them 
several years ago.

I think one strategy to approach would be to re-write whole thing from 
scratch, but keeping the main principles:
1. Each property and association should be stored in separate table. 
This is required for the very dynamic nature of Zest queries.
     This implicates that the whole DB schema will be owned by this 
indexing service - it is not compatible with some external tool 
generating tables (unless tables are, of course, in this exact required 
format).

     Example: if you have interface MyEntity { Property<String> 
myProp(); Association<SomeOtherEntity> myAsso(); },
     That would end up with 3 tables:
     1.1. The table for implicit 'identity' property. Current indexing 
code has separate DB-specific entity primary key which is of type 'long' 
(64bit integer), since that makes foreign key references much faster and 
compacter. Not sure if that is good idea anymore.
     1.2. The table for 'myProp' property with 2 columns: foreign key 
reference to identity table and actual column of type 'text' containing 
the contents of 'myProp' property.
     1.3. The table for 'myAsso' association with 2 columns: foreign key 
reference to identity table of 'MyEntity' and foreign key reference to 
identity table of 'SomeOtherEntity'. Obviously this covers the 
'ManyAssociation's as well, if they are still present in Zest.

2. The SQL query will then be created by translating Zest query objects 
into SQL query objects. You could also just build SQL strings, but it 
would be quite nasty and most likely very unmaintaineable task - current 
code uses java-sql-generator, which is my library on github, to create 
SQL statements using DSL, and not by concatenating strings.
     IIRC Zest AND-conditions are SQL INTERSECTIONs of single 
property/asso queries, Zest OR-conditions are SQL UNIONs of single 
property/asso queries, and Zest NOT-conditions are SQL NOTs of single 
property/asso queries.

3. Keeping DB schema in sync with Zest application schema.
     This is very tough one - if someone adds/removes 
properties/associations, the database schema should be kept up-to-date.
     I remember having problems with this one - it also implies that you 
have to walk through the *whole* application structure in order to 
generate tables for all the properties/assos of all the entities visible 
to this indexing service.


Mime
View raw message