ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Paschenko <alexander.a.pasche...@gmail.com>
Subject Re: IGNITE-2294 implementation details
Date Wed, 27 Jul 2016 09:40:04 GMT
Sergi,

You wrote:
> I'd prefer to return the same information, so it will not be empty

Do you mean return iterator with single element that denotes number of rows?

Dmitriy,

You wrote:
> What is the ticket number for this. Is the new API documented there?

Overall issue number is 2294. There's no particular issue on API
changes, but creating one seems to be a good idea, I will do it.

- Alex

2016-07-27 9:20 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
> What is the ticket number for this. Is the new API documented there?
>
> On Tue, Jul 26, 2016 at 11:36 AM, Sergi Vladykin <sergi.vladykin@gmail.com>
> wrote:
>
>> I don't see anything ugly in empty iterator, sorry if I insulted your taste
>> of beauty.
>>
>> If you will take a look at Jdbc, you will see that Statement.executeUpdate
>> method returns number of updated rows, I'd prefer to return the same
>> information, so it will not be empty (beauty is restored!).
>>
>> Sergi
>>
>>
>>
>> 2016-07-26 18:24 GMT+03:00 Alexander Paschenko <
>> alexander.a.paschenko@gmail.com>:
>>
>> > I see your point. But what about my concerns from initial post?
>> > Particularly about signatures of existing methods? I personally don't
>> > like an option of query() method always returning an empty iterator
>> > for any non-select query, it seems ugly design wise.
>> >
>> > - Alex
>> >
>> > 2016-07-26 18:15 GMT+03:00 Sergi Vladykin <sergi.vladykin@gmail.com>:
>> > > BTW, the simplest way to solve this issue is to allow running SQL
>> > commands
>> > > inside of SqlFieldsQuery.
>> > >
>> > > We may add some additional convenience API for updates if we want, but
>> > JDBC
>> > > client will always call it like this:
>> > >
>> > > cache.query(new SqlFieldsQuery("INSERT INTO MY_TABLE
>> > > VALUES(?,?)").setArgs(1,2));
>> > >
>> > > This will resolve any ambiguity.
>> > >
>> > > Sergi
>> > >
>> > > 2016-07-26 17:56 GMT+03:00 Sergi Vladykin <sergi.vladykin@gmail.com>:
>> > >
>> > >> I don't like any pre-parsing, especially with some libraries other
>> than
>> > >> H2. H2 itself has enough quirks to multiply it on quirks of another
>> > library.
>> > >>
>> > >> This is exactly what I was talking about - we need some single entry
>> > point
>> > >> on API for all the SQL commands and queries. Thats why I suggested
>> > >> SqlUpdate to extend Query. To me its is the cleanest approach. May
be
>> we
>> > >> need to change in some backward compatible way this Query hierarchy
to
>> > get
>> > >> rid of extra methods but the idea is still the same.
>> > >>
>> > >> Sergi
>> > >>
>> > >> 2016-07-26 14:34 GMT+03:00 Alexander Paschenko <
>> > >> alexander.a.paschenko@gmail.com>:
>> > >>
>> > >>> Guys,
>> > >>>
>> > >>> I would like to advance the discussion further. There's one quite
>> > >>> important question that arose based on current state of work on
this
>> > >>> issue. If we use some kind of interactive console, like Visor,
then
>> > >>> how should it know whether SQL query it is requested to execute
>> > >>> returns a result set or not? In JDBC world, solution is quite simple
>> -
>> > >>> there's base interface called Statement that all commands implement,
>> > >>> and it has magic isResultSet method that tells whether statement
is a
>> > >>> query or an update command. The API proposed now has separate Query
>> > >>> and Update operations which I believe to be a right thing by the
>> > >>> reasons I outlined in the beginning of this thread. However, their
>> > >>> lack of common ancestor prevents possible console clients from
>> running
>> > >>> text SQL commands in a fully transparent manner - like
>> > >>> IgniteCache.execute(String sql). Therefore I see two possible ways
of
>> > >>> solving this:
>> > >>>
>> > >>> - we change API so that it includes new class or interface parenting
>> > >>> both Query and Update, and clients use it to communicate with cache
>> > >>> - we let (or make :) ) the client determine command type
>> independently
>> > >>> and behave accordingly - for it to work it will have some kind
of
>> > >>> command parsing by itself just to determine its type. Visor console
>> > >>> may use simple library like JSqlParser
>> > >>> (https://github.com/JSQLParser/JSqlParser; dual LGPL 2.1/ASF 2.0
>> > >>> licensed) to determine request type in terms of JDBC, and behave
>> > >>> accordingly.
>> > >>>
>> > >>> Personally, I think that the second approach is better - and here's
>> > why.
>> > >>>
>> > >>> First, it does not seem wise to change API simply to make console
(or
>> > >>> any other) clients simpler. Programmatic APIs should be concise
and
>> > >>> short for programmatic use, console clients should be easy to use
>> from
>> > >>> console - and that's it: after all, console client exists to free
a
>> > >>> user from burden of doing things programmatically, so its aim is
to
>> > >>> adapt API to console or whatever UI.
>> > >>> Second, possible complications in client implied by such approach
>> > >>> certainly won't be dramatic - I don't think that additional single
>> > >>> query parsing operation in client code will make it much harder
to
>> > >>> develop.
>> > >>> Third, as I see it now, adding a new "synthetic" entity and new
>> method
>> > >>> would take more effort to adapting the client to new API.
>> > >>>
>> > >>> Dmitry, Sergi, I would like to hear what you think about it all.
>> > Thanks.
>> > >>>
>> > >>> - Alex
>> > >>>
>> > >>> 2016-07-21 21:17 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org
>> >:
>> > >>> > OK, then using your analogy, the current behavior in Ignite
is
>> MERGE
>> > for
>> > >>> > the most part.
>> > >>> >
>> > >>> > My preference is that Ignite SQL should work no different
from
>> > >>> traditional
>> > >>> > databases, which means:
>> > >>> >
>> > >>> > - INSERT is translated into *putIfAbsent()* call in Ignite
>> > >>> > - UPDATE is translated into *replace()* call in Ignite
>> > >>> > - MERGE is translated into *put()* call in Ignite
>> > >>> > - For SQL BATCH calls we should delegate to Ignite batch
>> operations,
>> > >>> e.g.
>> > >>> > *putAll()*
>> > >>> >
>> > >>> > The above should hold true for atomic and transactional put/putAll
>> > >>> calls,
>> > >>> > as well as for the data streamer.
>> > >>> >
>> > >>> > Does this make sense?
>> > >>> >
>> > >>> > D.
>> > >>> >
>> > >>> > On Thu, Jul 21, 2016 at 4:06 AM, Sergi Vladykin <
>> > >>> sergi.vladykin@gmail.com>
>> > >>> > wrote:
>> > >>> >
>> > >>> >> No, this does not make sense.
>> > >>> >>
>> > >>> >> There is no upsert mode in databases. There are operations:
>> INSERT,
>> > >>> UPDATE,
>> > >>> >> DELETE, MERGE.
>> > >>> >>
>> > >>> >> I want to have clear understanding of how they have to
behave in
>> SQL
>> > >>> >> databases and how they will actually behave in Ignite
in different
>> > >>> >> scenarios. Also I want to have clear understanding of
performance
>> > >>> >> implications of each decision here.
>> > >>> >>
>> > >>> >> Anything wrong with that?
>> > >>> >>
>> > >>> >> Sergi
>> > >>> >>
>> > >>> >> On Thu, Jul 21, 2016 at 1:04 PM, Dmitriy Setrakyan <
>> > >>> dsetrakyan@apache.org>
>> > >>> >> wrote:
>> > >>> >>
>> > >>> >> > Serj, are you asking what will happen as of today?
Then the
>> answer
>> > >>> to all
>> > >>> >> > your questions is that duplicate keys are not an
issue, and
>> Ignite
>> > >>> always
>> > >>> >> > operates in **upsert** mode (which is essentially
a *“put(…)”
>> > >>> *method).
>> > >>> >> >
>> > >>> >> > However, the *“insert”* that is suggested by
Alex would delegate
>> > to
>> > >>> >> > *“putIfAbsent(…)”*, which in database world
makes more sense.
>> > >>> However, in
>> > >>> >> > this case, the *“update”* syntax should delegate
to
>> > *“replace(…)”*,
>> > >>> as
>> > >>> >> > update should fail in case if a key is absent.
>> > >>> >> >
>> > >>> >> > Considering the above, a notion of “*upsert”*
or “*merge”
>> > *operation
>> > >>> is
>> > >>> >> > very much needed, as it will give a user an option
to perform
>> > >>> >> > “insert-or-update” in 1 call.
>> > >>> >> >
>> > >>> >> > Does this make sense?
>> > >>> >> >
>> > >>> >> > D.
>> > >>> >> >
>> > >>> >> > On Wed, Jul 20, 2016 at 9:39 PM, Sergi Vladykin <
>> > >>> >> sergi.vladykin@gmail.com>
>> > >>> >> > wrote:
>> > >>> >> >
>> > >>> >> > > I'd prefer to do MERGE operation last because
in H2 it is not
>> > >>> standard
>> > >>> >> > ANSI
>> > >>> >> > > SQL MERGE. Or may be not implement it at all,
or may be
>> > contribute
>> > >>> ANSI
>> > >>> >> > > correct version to H2, then implement it on
Ignite. Need to
>> > >>> investigate
>> > >>> >> > the
>> > >>> >> > > semantics deeper before making any decisions
here.
>> > >>> >> > >
>> > >>> >> > > Lets start with simple scenarios for INSERT
and go through all
>> > the
>> > >>> >> > possible
>> > >>> >> > > cases and answer the questions:
>> > >>> >> > > - What will happen on key conflict in TX cache?
>> > >>> >> > > - What will happen on key conflict in Atomic
cache?
>> > >>> >> > > - What will happen with the previous two if
we use DataLoader?
>> > >>> >> > > - How to make these operations efficient (it
will be simple
>> > enough
>> > >>> to
>> > >>> >> > > implement them with separate put/putIfAbsent
operations but
>> > >>> probably we
>> > >>> >> > > will need some batching like putAllIfAbsent
for efficiency)?
>> > >>> >> > >
>> > >>> >> > > As for API, we still will need to have a single
entry point
>> for
>> > >>> all SQL
>> > >>> >> > > queries/commands to allow any console work with
it
>> > transparently.
>> > >>> It
>> > >>> >> > would
>> > >>> >> > > be great if we will be able to come up with
something
>> consistent
>> > >>> with
>> > >>> >> > this
>> > >>> >> > > idea on public API.
>> > >>> >> > >
>> > >>> >> > > Sergi
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > >
>> > >>> >> > > On Wed, Jul 20, 2016 at 2:23 PM, Dmitriy Setrakyan
<
>> > >>> >> > > dsetrakyan@gridgain.com>
>> > >>> >> > > wrote:
>> > >>> >> > >
>> > >>> >> > > > Like the idea of merge and insert. I need
more time to think
>> > >>> about
>> > >>> >> the
>> > >>> >> > > API
>> > >>> >> > > > changes.
>> > >>> >> > > >
>> > >>> >> > > > Sergi, what do you think?
>> > >>> >> > > >
>> > >>> >> > > > Dmitriy
>> > >>> >> > > >
>> > >>> >> > > >
>> > >>> >> > > >
>> > >>> >> > > > On Jul 20, 2016, at 12:36 PM, Alexander
Paschenko <
>> > >>> >> > > > alexander.a.paschenko@gmail.com> wrote:
>> > >>> >> > > >
>> > >>> >> > > > >> Thus, I suggest that we implement
MERGE as a separate
>> > >>> operation
>> > >>> >> > backed
>> > >>> >> > > > by putIfAbsent operation, while INSERT
will be implemented
>> via
>> > >>> put.
>> > >>> >> > > > >
>> > >>> >> > > > > Sorry, of course I meant that MERGE
has to be put-based,
>> > while
>> > >>> >> INSERT
>> > >>> >> > > > > has to be putIfAbsent-based.
>> > >>> >> > > > >
>> > >>> >> > > > > 2016-07-20 12:30 GMT+03:00 Alexander
Paschenko
>> > >>> >> > > > > <alexander.a.paschenko@gmail.com>:
>> > >>> >> > > > >> Hell Igniters,
>> > >>> >> > > > >>
>> > >>> >> > > > >> In this thread I would like to
share and discuss some
>> > >>> thoughts on
>> > >>> >> > DML
>> > >>> >> > > > >> operations' implementation, so
let's start and keep it
>> > here.
>> > >>> >> > Everyone
>> > >>> >> > > > >> is of course welcome to share
their suggestions.
>> > >>> >> > > > >>
>> > >>> >> > > > >> For starters, I was thinking about
semantics of INSERT.
>> In
>> > >>> >> > traditional
>> > >>> >> > > > >> RDBMSs, INSERT works only for
records whose primary keys
>> > don't
>> > >>> >> > > > >> conflict with those of records
that are already
>> persistent
>> > -
>> > >>> you
>> > >>> >> > can't
>> > >>> >> > > > >> try to insert the same key more
than once because you'll
>> > get
>> > >>> an
>> > >>> >> > error.
>> > >>> >> > > > >> However, semantics of cache put
is obviously different -
>> it
>> > >>> does
>> > >>> >> not
>> > >>> >> > > > >> have anything about duplicate
keys, it just quietly
>> updates
>> > >>> values
>> > >>> >> > in
>> > >>> >> > > > >> case of keys' duplication. Still,
cache has putIfAbsent
>> > >>> operation
>> > >>> >> > that
>> > >>> >> > > > >> is closer to traditional notion
of INSERT, and H2's SQL
>> > >>> dialect
>> > >>> >> has
>> > >>> >> > > > >> MERGE operation which corresponds
to semantics of cache
>> > put.
>> > >>> >> Thus, I
>> > >>> >> > > > >> suggest that we implement MERGE
as a separate operation
>> > >>> backed by
>> > >>> >> > > > >> putIfAbsent operation, while INSERT
will be implemented
>> via
>> > >>> put.
>> > >>> >> > > > >>
>> > >>> >> > > > >> And one more, probably more important
thing: I suggest
>> > that we
>> > >>> >> > create
>> > >>> >> > > > >> separate class Update and corresponding
operation
>> update()
>> > in
>> > >>> >> > > > >> IgniteCache. The reasons are as
follows:
>> > >>> >> > > > >>
>> > >>> >> > > > >> - Query bears some flags that
are clearly redundant for
>> > Update
>> > >>> >> (page
>> > >>> >> > > > >> size, locality)
>> > >>> >> > > > >> - query() method in IgniteCache
(one that accepts Query)
>> > and
>> > >>> >> query()
>> > >>> >> > > > >> methods in GridQueryIndexing return
iterators. So, if we
>> > >>> strive to
>> > >>> >> > > > >> leave interfaces unchanged, we
still will introduce some
>> > >>> design
>> > >>> >> > > > >> ugliness like query methods returning
empty iterators for
>> > >>> certain
>> > >>> >> > > > >> queries, and/or query flags that
indicate whether it's an
>> > >>> update
>> > >>> >> > query
>> > >>> >> > > > >> or not, etc.
>> > >>> >> > > > >> - If some Queries are update queries,
then continuous
>> > queries
>> > >>> >> can't
>> > >>> >> > be
>> > >>> >> > > > >> based on them - more design-wise
ugly checks and stuff
>> like
>> > >>> that.
>> > >>> >> > > > >> - I'm pretty sure there's more
I don't know about.
>> > >>> >> > > > >>
>> > >>> >> > > > >> Comments and suggestions are welcome.
Sergi Vladykin,
>> > Dmitry
>> > >>> >> > > > >> Setrakyan, your opinions are of
particular interest,
>> please
>> > >>> >> advise.
>> > >>> >> > > > >>
>> > >>> >> > > > >> Regards,
>> > >>> >> > > > >> Alex
>> > >>> >> > > >
>> > >>> >> > >
>> > >>> >> >
>> > >>> >>
>> > >>>
>> > >>
>> > >>
>> >
>>

Mime
View raw message