polygene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stanislav Muhametsin <stanislav.muhamet...@zest.mail.kapsi.fi>
Subject Re: JPA?
Date Wed, 21 Sep 2016 09:59:13 GMT
On 21.9.2016 12:00, Stanislav Muhametsin wrote:
> On 21.9.2016 0:08, Jiri Jetmar wrote:
>> Independently of that that, things starts to be complicated in the SQL
>> world with large data when you are submitting e.g. a inner JOIN 
>> statement
>> in a transactional INSERT expression, where the tables are located  (the
>> data) on different nodes, simply because of a single node limit.
> Well... there are ways to circumvent this, too. :)
> Unfortunately, can't tell much more about that, as it is key concept 
> of the company where I work right now.
> But very large and globally distributed RDBMS nodes, which don't lag 
> under high stress, are very much so possible!
>> I would like to work on this task. Therefore smart ideas are highly 
>> welcome
>> ! :-)
> I am responsible for current SQL indexing code, and Paul is 
> responsible for current SQL entitystore code.
> The indexing code is, unfortunately, quite spaghettified, and I would 
> do lots of things probably very differently now, than how I did them 
> several years ago.
> I think one strategy to approach would be to re-write whole thing from 
> scratch, but keeping the main principles:
> 1. Each property and association should be stored in separate table. 
> This is required for the very dynamic nature of Zest queries.
>     This implicates that the whole DB schema will be owned by this 
> indexing service - it is not compatible with some external tool 
> generating tables (unless tables are, of course, in this exact 
> required format).
>     Example: if you have interface MyEntity { Property<String> 
> myProp(); Association<SomeOtherEntity> myAsso(); },
>     That would end up with 3 tables:
>     1.1. The table for implicit 'identity' property. Current indexing 
> code has separate DB-specific entity primary key which is of type 
> 'long' (64bit integer), since that makes foreign key references much 
> faster and compacter. Not sure if that is good idea anymore.
>     1.2. The table for 'myProp' property with 2 columns: foreign key 
> reference to identity table and actual column of type 'text' 
> containing the contents of 'myProp' property.
>     1.3. The table for 'myAsso' association with 2 columns: foreign 
> key reference to identity table of 'MyEntity' and foreign key 
> reference to identity table of 'SomeOtherEntity'. Obviously this 
> covers the 'ManyAssociation's as well, if they are still present in Zest.

I forgot one thing to mention: the collectionized properties were cause 
of major headache in SQL indexing.
IIRC I made separate tables and/or columns to model collections 
properly, in order to properly support queries which had conditions like 
"property X is list containing entity Y with property Z at value Ö".

However, with the JSON datatype, it might be better (?) idea to make 
collectionized properties as single column with JSON datatype, and just 
JSON-ize collections.

> 2. The SQL query will then be created by translating Zest query 
> objects into SQL query objects. You could also just build SQL strings, 
> but it would be quite nasty and most likely very unmaintaineable task 
> - current code uses java-sql-generator, which is my library on github, 
> to create SQL statements using DSL, and not by concatenating strings.
>     IIRC Zest AND-conditions are SQL INTERSECTIONs of single 
> property/asso queries, Zest OR-conditions are SQL UNIONs of single 
> property/asso queries, and Zest NOT-conditions are SQL NOTs of single 
> property/asso queries.
> 3. Keeping DB schema in sync with Zest application schema.
>     This is very tough one - if someone adds/removes 
> properties/associations, the database schema should be kept up-to-date.
>     I remember having problems with this one - it also implies that 
> you have to walk through the *whole* application structure in order to 
> generate tables for all the properties/assos of all the entities 
> visible to this indexing service.

View raw message