ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Dmitriy Setrakyan <dsetrak...@apache.org>
Subject Re: DDL implementation details
Date Wed, 18 Jan 2017 03:29:27 GMT
After looking at all the restrictions of "cache==schema" approach, I still
think that it would be the most natural match for Ignite. All other
suggestions are either too limiting, too hard, or not applicable for
Ignite.

My vote would be to start with "cache==schema" and get some user feedback.
We can always add "tablespaces" in future, if our users demand it.

D.

On Tue, Jan 17, 2017 at 1:29 AM, Sergi Vladykin <sergi.vladykin@gmail.com>
wrote:

> Sergey,
>
> See inline:
>
> 2017-01-17 0:50 GMT+03:00 Sergey Kozlov <skozlov@gridgain.com>:
>
> > For the approach schema==cache it is not clear how it works for some
> cases
> > (especialy if SQL and regular cache operations are mixed):
> >
> > 1. I want to store two tables in same cache (same schema) and both tables
> > have identical structure like id Integer, name String.
> > How they will be processed for regular cache operations
> > put/putall/get/getall etc where we get cache instance by its name (on SQL
> > level we use the table name) ?
> >
>
> This already works. We can have multiple QueryEntities configured for a
> single cache. The only restriction here is that cache key must be unique
> across all the tables inside the same schema.
>
>
> >
> > 2. What's about ALTER TABLE statement? Will the changes for one table
> lock
> > all tables in the cache?
> >
>
> It will not be needed, any table structure modification will be performed
> in the table scope.
>
>
> > 3. In the future if we will introduce table-based features like SQL roles
> > (as next step of SQL implementation) "many tables one cache" rule looks
> > like more difficult for its implementation.
> >
>
> I don't think so, must be the same thing.
>
>
> >
> > p.s. may be we need to think about new entity called "table" (based on
> type
> > descriptor) inside cache to better support of SQL DDL
> >
>
> We already have it: QueryEntity. It has exactly that semantics.
>
> Sergi
>
>
> >
> >
> > On Mon, Jan 16, 2017 at 11:06 PM, Sergi Vladykin <
> sergi.vladykin@gmail.com
> > >
> > wrote:
> >
> > > Dima,
> > >
> > > I agree that cache==table is definitely a wrong choice, but as far as I
> > see
> > > Vova suggests having cache==tablespace instead of cache==schema. I tend
> > to
> > > agree with this decoupling of physical and logical grouping, but the
> > > concern is that it will require much more work to do.
> > >
> > > Sergi
> > >
> > > 2017-01-16 21:35 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> > >
> > > > Vova,
> > > >
> > > > Currently I see only 2 ways we can proceed here:
> > > >
> > > >    1. cache == table
> > > >    2. cache == schema
> > > >
> > > > I agree that "cache==table" may be more flexible, but I don't think
> it
> > > will
> > > > work in Ignite.
> > > > We may end up with 1,000s of caches, which will carry significant
> > > overhead
> > > > on memory and cluster overall. I think that we have no choice but to
> > take
> > > > "cache==schema" approach.
> > > >
> > > > D.
> > > >
> > > > On Mon, Jan 16, 2017 at 1:00 AM, Vladimir Ozerov <
> vozerov@gridgain.com
> > >
> > > > wrote:
> > > >
> > > > > Sergi, Dima,
> > > > >
> > > > > In the scope of Ignite 1.x it is perfectly fine to have "schema =
> > > cache".
> > > > > Nobody suffers from it because nobody use Ignite as database. But
> in
> > > > > future, thanks to page memory, we are going to target real database
> > use
> > > > > cases. Users will have multiple tables in Ignite. Plus views,
> > triggers,
> > > > > constraints, etc.. All these features are very useful and easy to
> > > > implement
> > > > > provided that we already have table and index implementations. And
> in
> > > > > databases all related objects are *logically *grouped in a
> "schema".
> > > This
> > > > > is convenient for users: less boilerplate in SQL, better
> > manageability
> > > > > (remember that database users will definitely need some console
> > and/or
> > > UI
> > > > > tools to manage Ignite as a database).
> > > > >
> > > > > What you offer is to group database objects *physically *rather
> than
> > > > > logically. It will lead to:
> > > > > - Boilerplate in queries
> > > > > - Inconvenient database management. All the things database users
> are
> > > > used
> > > > > to - import/export tools, UIs, "USING" keyword, etc, will look
> weird
> > in
> > > > > Ignite as there will be no way to group arbitrary objects
> logically.
> > > > >
> > > > > With this approach almost every user will have to use two schemes
> > > instead
> > > > > of one - one for operational data (PARTITIONED) and one for
> reference
> > > > data
> > > > > (REPLICATED). No conventional database works this way.
> > > > >
> > > > > Vladimir.
> > > > >
> > > > > On Fri, Jan 13, 2017 at 9:18 PM, Dmitriy Setrakyan <
> > > > dsetrakyan@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Vova,
> > > > > >
> > > > > > I will join Sergi here. It seems like "schema = cache" will
take
> > care
> > > > of
> > > > > > all different configuration properties required for different
> > groups
> > > of
> > > > > > caches. In addition, it cleanly maps into current Ignite
> > > architecture.
> > > > We
> > > > > > will need to have a very strong reason to move away from it.
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Fri, Jan 13, 2017 at 2:39 AM, Vladimir Ozerov <
> > > vozerov@gridgain.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Correct, it worked, because Ignite has never had real database
> > use
> > > > case
> > > > > > in
> > > > > > > mind. Unfortunately, if our global plans go as expected,
it
> will
> > > not
> > > > > work
> > > > > > > for Ignite 2.x+.
> > > > > > >
> > > > > > > On Fri, Jan 13, 2017 at 11:53 AM, Sergi Vladykin <
> > > > > > sergi.vladykin@gmail.com
> > > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Lets move on with SQL schema == Ignite cache. It worked
> always
> > > like
> > > > > > > this, I
> > > > > > > > see no reasons to change this.
> > > > > > > >
> > > > > > > > Sergi
> > > > > > > >
> > > > > > > > 2017-01-13 11:20 GMT+03:00 Vladimir Ozerov <
> > vozerov@gridgain.com
> > > >:
> > > > > > > >
> > > > > > > > > "Tablespace" (Oracle, PostgreSQL) is what maps
better than
> > > > "schema"
> > > > > > to
> > > > > > > > our
> > > > > > > > > cache. But not ideally still.
> > > > > > > > >
> > > > > > > > > On Fri, Jan 13, 2017 at 11:10 AM, Vladimir Ozerov
<
> > > > > > > vozerov@gridgain.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Alex,
> > > > > > > > > >
> > > > > > > > > > Currently Ignite is not used as database.
It is used as
> > > search
> > > > > > > engine -
> > > > > > > > > > several types, several tables, several joins.
This is why
> > > > having
> > > > > > > > "SCHEMA
> > > > > > > > > ==
> > > > > > > > > > cache" was never a problem. Users have never
build
> complex
> > > SQL
> > > > > > > > > applications
> > > > > > > > > > on top of Ignite. But we are going towards
database. And
> my
> > > > > > question
> > > > > > > > > stands
> > > > > > > > > > still - suppose it is Y2019, how is user
going to migrate
> > his
> > > > > > > database
> > > > > > > > > > containing 20-30-50-100 tables in a single
schema in
> Oracle
> > > to
> > > > > > > Ignite?
> > > > > > > > > >
> > > > > > > > > > Single cache for all tables? Doens't work
- not flexible.
> > > Users
> > > > > > will
> > > > > > > > > > definitely require different cache modes,
different
> > > co-location
> > > > > > > rules,
> > > > > > > > > > different number of backups, etc..
> > > > > > > > > > Schema per table? Doesn't work either -
unmanageable and
> > not
> > > > > > > convenient
> > > > > > > > > > for users even for relatively small databases.
> > > > > > > > > >
> > > > > > > > > > From user perspective schema is logical
grouping of
> > database
> > > > > > objects,
> > > > > > > > > > nothing more.
> > > > > > > > > >
> > > > > > > > > > For Ignite schema could be a logical group
of resources
> > > (nodes,
> > > > > > > memory
> > > > > > > > > > pools, caches, etc.). And multiple tables
over multiple
> > > caches
> > > > > > should
> > > > > > > > > > reside in it. To the contrast, table definition
governs
> how
> > > > data
> > > > > is
> > > > > > > > > stored.
> > > > > > > > > > This is similar to, for example, MySQL approach,
where
> you
> > > > define
> > > > > > how
> > > > > > > > you
> > > > > > > > > > store data on per-table level, and on schema
level you
> > define
> > > > > only
> > > > > > > > minor
> > > > > > > > > > things like collation.
> > > > > > > > > >
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Jan 13, 2017 at 10:33 AM, Alexander
Paschenko <
> > > > > > > > > > alexander.a.paschenko@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > >> Vova,
> > > > > > > > > >>
> > > > > > > > > >> 2017-01-13 4:56 GMT+08:00 Vladimir Ozerov
<
> > > > vozerov@gridgain.com
> > > > > >:
> > > > > > > > > >> > I am not quite sure I understand
the idea of "SCHEMA
> ==
> > > > > cache".
> > > > > > > > > Consider
> > > > > > > > > >> > some small database with, say,
~30 tables. And user
> > wants
> > > to
> > > > > > > migrate
> > > > > > > > > to
> > > > > > > > > >> > Ignite. How is he supposed to do
so? 30 schemas
> leading
> > to
> > > > > > rewrite
> > > > > > > > of
> > > > > > > > > >> all
> > > > > > > > > >> > his SQL scripts? Or 30 key-value
pairs in a single
> cache
> > > > > leading
> > > > > > > to
> > > > > > > > > >> lack of
> > > > > > > > > >> > flexibility and performance problems?
> > > > > > > > > >>
> > > > > > > > > >> But currently schema *is* semantically
equal to cache
> > while
> > > > > table
> > > > > > is
> > > > > > > > > >> equal to type descriptor (i.e. type
of stored entities),
> > > > nothing
> > > > > > new
> > > > > > > > > >> here.
> > > > > > > > > >>
> > > > > > > > > >> Say, in single cache we may have entities
of types
> Person
> > > and
> > > > > > > > > >> Organization, those map to two tables
with same names,
> and
> > > can
> > > > > be
> > > > > > > > > >> accessed within the same cache (i.e.
schema).
> > > > > > > > > >>
> > > > > > > > > >> If we want to limit the user with having
single type
> > > > descriptor
> > > > > > per
> > > > > > > > > >> cache (i.e. cache has only one type
of stored entities -
> > > BTW,
> > > > > > where
> > > > > > > we
> > > > > > > > > >> are with this 2.0-wise?), then this
notion could change.
> > But
> > > > > > > currently
> > > > > > > > > >> what has been suggested already fits
quite good with
> what
> > we
> > > > do
> > > > > > have
> > > > > > > > > >> at the moment regarding semantic of
SQL objects.
> > > > > > > > > >>
> > > > > > > > > >> - Alex
> > > > > > > > > >>
> > > > > > > > > >> > Another example is how to deal
with referene tables?
> > Lots
> > > > > > database
> > > > > > > > has
> > > > > > > > > >> > small reference tables which is
best to fit REPLICATED
> > > > cache,
> > > > > > > while
> > > > > > > > > >> others
> > > > > > > > > >> > are usually bound to PARTITIONED
mode. "SCHEMA ==
> cache"
> > > > will
> > > > > > > force
> > > > > > > > > >> users
> > > > > > > > > >> > to split them into separate schemes
leading to poor
> user
> > > > > > > experience.
> > > > > > > > > >> >
> > > > > > > > > >> > I understand that we may have some
implementation
> > details
> > > > > around
> > > > > > > it
> > > > > > > > at
> > > > > > > > > >> the
> > > > > > > > > >> > moment. But from user perspective
"SCHEMA == cache"
> > > doesn't
> > > > > make
> > > > > > > > > sense.
> > > > > > > > > >> As
> > > > > > > > > >> > we are going towards AI 2.0 we'd
better to rethink
> this
> > > > > > approach.
> > > > > > > > > >> >
> > > > > > > > > >> > On Thu, Jan 12, 2017 at 11:46 PM,
Denis Magda <
> > > > > > dmagda@apache.org>
> > > > > > > > > >> wrote:
> > > > > > > > > >> >
> > > > > > > > > >> >>
> > > > > > > > > >> >> > On Jan 12, 2017, at 12:35
PM, Dmitriy Setrakyan <
> > > > > > > > > >> dsetrakyan@apache.org>
> > > > > > > > > >> >> wrote:
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > On Thu, Jan 12, 2017 at
9:47 AM, Sergi Vladykin <
> > > > > > > > > >> >> sergi.vladykin@gmail.com>
> > > > > > > > > >> >> > wrote:
> > > > > > > > > >> >> >
> > > > > > > > > >> >> >> The xml config was
only for example. We can put in
> > > this
> > > > > > > > > >> configuration
> > > > > > > > > >> >> >> string cache config
parameters directly like this:
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >> CREATE SCHEMA "MyCacheName"
WITH
> > > > > > > > > >> >> >> "cacheMode=REPLICATED;atomicityMode=ATOMIC"
> > > > > > > > > >> >> >>
> > > > > > > > > >> >> >
> > > > > > > > > >> >> > This approach makes sense,
if it can be easily
> > > supported
> > > > > with
> > > > > > > H2.
> > > > > > > > > >> >>
> > > > > > > > > >> >> What’s for affinity keys?
Can we make an exception
> for
> > > them
> > > > > by
> > > > > > > > > >> defining in
> > > > > > > > > >> >> this part of the statement
> > > > > > > > > >> >>
> > > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > > >> >>    dept_id BIGINT AFFINITY
KEY,
> > > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > > >> >> );
> > > > > > > > > >> >>
> > > > > > > > > >> >> or that l
> > > > > > > > > >> >>
> > > > > > > > > >> >> CREATE TABLE employee (
> > > > > > > > > >> >>    id BIGINT PRIMARY KEY,
> > > > > > > > > >> >>    dept_id BIGINT,
> > > > > > > > > >> >>    name VARCHAR(128),
> > > > > > > > > >> >>    CONSTRAINT affKey AFFINITY
KEY(dept_id)
> > > > > > > > > >> >> );
> > > > > > > > > >> >>
> > > > > > > > > >> >> ?
> > > > > > > > > >> >>
> > > > > > > > > >> >> —
> > > > > > > > > >> >> Denis
> > > > > > > > > >> >>
> > > > > > > > > >> >>
> > > > > > > > > >>
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message