ignite-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Alexander Paschenko <alexander.a.pasche...@gmail.com>
Subject Re: IGNITE-2294 implementation details
Date Wed, 27 Jul 2016 11:15:24 GMT
OK, I've found that bold cast to QueryCursor<R> in IgniteCacheProxy
and had a look at how SqlFieldsQuery is used in JDBC driver. Thanks.

- Alex

2016-07-27 13:02 GMT+03:00 Sergi Vladykin <sergi.vladykin@gmail.com>:
> Where did you see R in SqlFieldsQuery?
>
> Sergi
>
> 2016-07-27 12:59 GMT+03:00 Alexander Paschenko <
> alexander.a.paschenko@gmail.com>:
>
>> Sergi,
>>
>> But current signature of query() method returns not just some
>> iterator, but rather iterator of R which is type param of Query -
>> i.e., we won't be able to return an int inside a QueryCursor<R>. At
>> least without API change (signature of query() method will have to be
>> changed to drop genericness, or in some other weird way). Is this what
>> we really want? Or am I missing something in your point?
>>
>> - Alex
>>
>> 2016-07-27 12:51 GMT+03:00 Sergi Vladykin <sergi.vladykin@gmail.com>:
>> > Exactly. This will allow our Jdbc driver to work transparently.
>> >
>> > Sergi
>> >
>> > 2016-07-27 12:40 GMT+03:00 Alexander Paschenko <
>> > alexander.a.paschenko@gmail.com>:
>> >
>> >> Sergi,
>> >>
>> >> You wrote:
>> >> > I'd prefer to return the same information, so it will not be empty
>> >>
>> >> Do you mean return iterator with single element that denotes number of
>> >> rows?
>> >>
>> >> Dmitriy,
>> >>
>> >> You wrote:
>> >> > What is the ticket number for this. Is the new API documented there?
>> >>
>> >> Overall issue number is 2294. There's no particular issue on API
>> >> changes, but creating one seems to be a good idea, I will do it.
>> >>
>> >> - Alex
>> >>
>> >> 2016-07-27 9:20 GMT+03:00 Dmitriy Setrakyan <dsetrakyan@apache.org>:
>> >> > What is the ticket number for this. Is the new API documented there?
>> >> >
>> >> > On Tue, Jul 26, 2016 at 11:36 AM, Sergi Vladykin <
>> >> sergi.vladykin@gmail.com>
>> >> > wrote:
>> >> >
>> >> >> I don't see anything ugly in empty iterator, sorry if I insulted
your
>> >> taste
>> >> >> of beauty.
>> >> >>
>> >> >> If you will take a look at Jdbc, you will see that
>> >> Statement.executeUpdate
>> >> >> method returns number of updated rows, I'd prefer to return the
same
>> >> >> information, so it will not be empty (beauty is restored!).
>> >> >>
>> >> >> Sergi
>> >> >>
>> >> >>
>> >> >>
>> >> >> 2016-07-26 18:24 GMT+03:00 Alexander Paschenko <
>> >> >> alexander.a.paschenko@gmail.com>:
>> >> >>
>> >> >> > I see your point. But what about my concerns from initial
post?
>> >> >> > Particularly about signatures of existing methods? I personally
>> don't
>> >> >> > like an option of query() method always returning an empty
iterator
>> >> >> > for any non-select query, it seems ugly design wise.
>> >> >> >
>> >> >> > - Alex
>> >> >> >
>> >> >> > 2016-07-26 18:15 GMT+03:00 Sergi Vladykin <
>> sergi.vladykin@gmail.com>:
>> >> >> > > BTW, the simplest way to solve this issue is to allow
running SQL
>> >> >> > commands
>> >> >> > > inside of SqlFieldsQuery.
>> >> >> > >
>> >> >> > > We may add some additional convenience API for updates
if we
>> want,
>> >> but
>> >> >> > JDBC
>> >> >> > > client will always call it like this:
>> >> >> > >
>> >> >> > > cache.query(new SqlFieldsQuery("INSERT INTO MY_TABLE
>> >> >> > > VALUES(?,?)").setArgs(1,2));
>> >> >> > >
>> >> >> > > This will resolve any ambiguity.
>> >> >> > >
>> >> >> > > Sergi
>> >> >> > >
>> >> >> > > 2016-07-26 17:56 GMT+03:00 Sergi Vladykin <
>> sergi.vladykin@gmail.com
>> >> >:
>> >> >> > >
>> >> >> > >> I don't like any pre-parsing, especially with some
libraries
>> other
>> >> >> than
>> >> >> > >> H2. H2 itself has enough quirks to multiply it on
quirks of
>> another
>> >> >> > library.
>> >> >> > >>
>> >> >> > >> This is exactly what I was talking about - we need
some single
>> >> entry
>> >> >> > point
>> >> >> > >> on API for all the SQL commands and queries. Thats
why I
>> suggested
>> >> >> > >> SqlUpdate to extend Query. To me its is the cleanest
approach.
>> May
>> >> be
>> >> >> we
>> >> >> > >> need to change in some backward compatible way this
Query
>> >> hierarchy to
>> >> >> > get
>> >> >> > >> rid of extra methods but the idea is still the same.
>> >> >> > >>
>> >> >> > >> Sergi
>> >> >> > >>
>> >> >> > >> 2016-07-26 14:34 GMT+03:00 Alexander Paschenko <
>> >> >> > >> alexander.a.paschenko@gmail.com>:
>> >> >> > >>
>> >> >> > >>> Guys,
>> >> >> > >>>
>> >> >> > >>> I would like to advance the discussion further.
There's one
>> quite
>> >> >> > >>> important question that arose based on current
state of work on
>> >> this
>> >> >> > >>> issue. If we use some kind of interactive console,
like Visor,
>> >> then
>> >> >> > >>> how should it know whether SQL query it is requested
to execute
>> >> >> > >>> returns a result set or not? In JDBC world, solution
is quite
>> >> simple
>> >> >> -
>> >> >> > >>> there's base interface called Statement that
all commands
>> >> implement,
>> >> >> > >>> and it has magic isResultSet method that tells
whether
>> statement
>> >> is a
>> >> >> > >>> query or an update command. The API proposed
now has separate
>> >> Query
>> >> >> > >>> and Update operations which I believe to be a
right thing by
>> the
>> >> >> > >>> reasons I outlined in the beginning of this thread.
However,
>> their
>> >> >> > >>> lack of common ancestor prevents possible console
clients from
>> >> >> running
>> >> >> > >>> text SQL commands in a fully transparent manner
- like
>> >> >> > >>> IgniteCache.execute(String sql). Therefore I
see two possible
>> >> ways of
>> >> >> > >>> solving this:
>> >> >> > >>>
>> >> >> > >>> - we change API so that it includes new class
or interface
>> >> parenting
>> >> >> > >>> both Query and Update, and clients use it to
communicate with
>> >> cache
>> >> >> > >>> - we let (or make :) ) the client determine command
type
>> >> >> independently
>> >> >> > >>> and behave accordingly - for it to work it will
have some kind
>> of
>> >> >> > >>> command parsing by itself just to determine its
type. Visor
>> >> console
>> >> >> > >>> may use simple library like JSqlParser
>> >> >> > >>> (https://github.com/JSQLParser/JSqlParser; dual
LGPL 2.1/ASF
>> 2.0
>> >> >> > >>> licensed) to determine request type in terms
of JDBC, and
>> behave
>> >> >> > >>> accordingly.
>> >> >> > >>>
>> >> >> > >>> Personally, I think that the second approach
is better - and
>> >> here's
>> >> >> > why.
>> >> >> > >>>
>> >> >> > >>> First, it does not seem wise to change API simply
to make
>> console
>> >> (or
>> >> >> > >>> any other) clients simpler. Programmatic APIs
should be concise
>> >> and
>> >> >> > >>> short for programmatic use, console clients should
be easy to
>> use
>> >> >> from
>> >> >> > >>> console - and that's it: after all, console client
exists to
>> free
>> >> a
>> >> >> > >>> user from burden of doing things programmatically,
so its aim
>> is
>> >> to
>> >> >> > >>> adapt API to console or whatever UI.
>> >> >> > >>> Second, possible complications in client implied
by such
>> approach
>> >> >> > >>> certainly won't be dramatic - I don't think that
additional
>> single
>> >> >> > >>> query parsing operation in client code will make
it much
>> harder to
>> >> >> > >>> develop.
>> >> >> > >>> Third, as I see it now, adding a new "synthetic"
entity and new
>> >> >> method
>> >> >> > >>> would take more effort to adapting the client
to new API.
>> >> >> > >>>
>> >> >> > >>> Dmitry, Sergi, I would like to hear what you
think about it
>> all.
>> >> >> > Thanks.
>> >> >> > >>>
>> >> >> > >>> - Alex
>> >> >> > >>>
>> >> >> > >>> 2016-07-21 21:17 GMT+03:00 Dmitriy Setrakyan
<
>> >> dsetrakyan@apache.org
>> >> >> >:
>> >> >> > >>> > OK, then using your analogy, the current
behavior in Ignite
>> is
>> >> >> MERGE
>> >> >> > for
>> >> >> > >>> > the most part.
>> >> >> > >>> >
>> >> >> > >>> > My preference is that Ignite SQL should
work no different
>> from
>> >> >> > >>> traditional
>> >> >> > >>> > databases, which means:
>> >> >> > >>> >
>> >> >> > >>> > - INSERT is translated into *putIfAbsent()*
call in Ignite
>> >> >> > >>> > - UPDATE is translated into *replace()*
call in Ignite
>> >> >> > >>> > - MERGE is translated into *put()* call
in Ignite
>> >> >> > >>> > - For SQL BATCH calls we should delegate
to Ignite batch
>> >> >> operations,
>> >> >> > >>> e.g.
>> >> >> > >>> > *putAll()*
>> >> >> > >>> >
>> >> >> > >>> > The above should hold true for atomic and
transactional
>> >> put/putAll
>> >> >> > >>> calls,
>> >> >> > >>> > as well as for the data streamer.
>> >> >> > >>> >
>> >> >> > >>> > Does this make sense?
>> >> >> > >>> >
>> >> >> > >>> > D.
>> >> >> > >>> >
>> >> >> > >>> > On Thu, Jul 21, 2016 at 4:06 AM, Sergi Vladykin
<
>> >> >> > >>> sergi.vladykin@gmail.com>
>> >> >> > >>> > wrote:
>> >> >> > >>> >
>> >> >> > >>> >> No, this does not make sense.
>> >> >> > >>> >>
>> >> >> > >>> >> There is no upsert mode in databases.
There are operations:
>> >> >> INSERT,
>> >> >> > >>> UPDATE,
>> >> >> > >>> >> DELETE, MERGE.
>> >> >> > >>> >>
>> >> >> > >>> >> I want to have clear understanding of
how they have to
>> behave
>> >> in
>> >> >> SQL
>> >> >> > >>> >> databases and how they will actually
behave in Ignite in
>> >> different
>> >> >> > >>> >> scenarios. Also I want to have clear
understanding of
>> >> performance
>> >> >> > >>> >> implications of each decision here.
>> >> >> > >>> >>
>> >> >> > >>> >> Anything wrong with that?
>> >> >> > >>> >>
>> >> >> > >>> >> Sergi
>> >> >> > >>> >>
>> >> >> > >>> >> On Thu, Jul 21, 2016 at 1:04 PM, Dmitriy
Setrakyan <
>> >> >> > >>> dsetrakyan@apache.org>
>> >> >> > >>> >> wrote:
>> >> >> > >>> >>
>> >> >> > >>> >> > Serj, are you asking what will
happen as of today? Then
>> the
>> >> >> answer
>> >> >> > >>> to all
>> >> >> > >>> >> > your questions is that duplicate
keys are not an issue,
>> and
>> >> >> Ignite
>> >> >> > >>> always
>> >> >> > >>> >> > operates in **upsert** mode (which
is essentially a
>> *“put(…)”
>> >> >> > >>> *method).
>> >> >> > >>> >> >
>> >> >> > >>> >> > However, the *“insert”* that
is suggested by Alex would
>> >> delegate
>> >> >> > to
>> >> >> > >>> >> > *“putIfAbsent(…)”*, which
in database world makes more
>> sense.
>> >> >> > >>> However, in
>> >> >> > >>> >> > this case, the *“update”* syntax
should delegate to
>> >> >> > *“replace(…)”*,
>> >> >> > >>> as
>> >> >> > >>> >> > update should fail in case if a
key is absent.
>> >> >> > >>> >> >
>> >> >> > >>> >> > Considering the above, a notion
of “*upsert”* or “*merge”
>> >> >> > *operation
>> >> >> > >>> is
>> >> >> > >>> >> > very much needed, as it will give
a user an option to
>> perform
>> >> >> > >>> >> > “insert-or-update” in 1 call.
>> >> >> > >>> >> >
>> >> >> > >>> >> > Does this make sense?
>> >> >> > >>> >> >
>> >> >> > >>> >> > D.
>> >> >> > >>> >> >
>> >> >> > >>> >> > On Wed, Jul 20, 2016 at 9:39 PM,
Sergi Vladykin <
>> >> >> > >>> >> sergi.vladykin@gmail.com>
>> >> >> > >>> >> > wrote:
>> >> >> > >>> >> >
>> >> >> > >>> >> > > I'd prefer to do MERGE operation
last because in H2 it
>> is
>> >> not
>> >> >> > >>> standard
>> >> >> > >>> >> > ANSI
>> >> >> > >>> >> > > SQL MERGE. Or may be not implement
it at all, or may be
>> >> >> > contribute
>> >> >> > >>> ANSI
>> >> >> > >>> >> > > correct version to H2, then
implement it on Ignite.
>> Need to
>> >> >> > >>> investigate
>> >> >> > >>> >> > the
>> >> >> > >>> >> > > semantics deeper before making
any decisions here.
>> >> >> > >>> >> > >
>> >> >> > >>> >> > > Lets start with simple scenarios
for INSERT and go
>> through
>> >> all
>> >> >> > the
>> >> >> > >>> >> > possible
>> >> >> > >>> >> > > cases and answer the questions:
>> >> >> > >>> >> > > - What will happen on key
conflict in TX cache?
>> >> >> > >>> >> > > - What will happen on key
conflict in Atomic cache?
>> >> >> > >>> >> > > - What will happen with the
previous two if we use
>> >> DataLoader?
>> >> >> > >>> >> > > - How to make these operations
efficient (it will be
>> simple
>> >> >> > enough
>> >> >> > >>> to
>> >> >> > >>> >> > > implement them with separate
put/putIfAbsent operations
>> but
>> >> >> > >>> probably we
>> >> >> > >>> >> > > will need some batching like
putAllIfAbsent for
>> >> efficiency)?
>> >> >> > >>> >> > >
>> >> >> > >>> >> > > As for API, we still will
need to have a single entry
>> point
>> >> >> for
>> >> >> > >>> all SQL
>> >> >> > >>> >> > > queries/commands to allow
any console work with it
>> >> >> > transparently.
>> >> >> > >>> It
>> >> >> > >>> >> > would
>> >> >> > >>> >> > > be great if we will be able
to come up with something
>> >> >> consistent
>> >> >> > >>> with
>> >> >> > >>> >> > this
>> >> >> > >>> >> > > idea on public API.
>> >> >> > >>> >> > >
>> >> >> > >>> >> > > Sergi
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> > > On Wed, Jul 20, 2016 at 2:23
PM, Dmitriy Setrakyan <
>> >> >> > >>> >> > > dsetrakyan@gridgain.com>
>> >> >> > >>> >> > > wrote:
>> >> >> > >>> >> > >
>> >> >> > >>> >> > > > Like the idea of merge
and insert. I need more time to
>> >> think
>> >> >> > >>> about
>> >> >> > >>> >> the
>> >> >> > >>> >> > > API
>> >> >> > >>> >> > > > changes.
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > > > Sergi, what do you think?
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > > > Dmitriy
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > > > On Jul 20, 2016, at 12:36
PM, Alexander Paschenko <
>> >> >> > >>> >> > > > alexander.a.paschenko@gmail.com>
wrote:
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > > > >> Thus, I suggest
that we implement MERGE as a
>> separate
>> >> >> > >>> operation
>> >> >> > >>> >> > backed
>> >> >> > >>> >> > > > by putIfAbsent operation,
while INSERT will be
>> >> implemented
>> >> >> via
>> >> >> > >>> put.
>> >> >> > >>> >> > > > >
>> >> >> > >>> >> > > > > Sorry, of course
I meant that MERGE has to be
>> >> put-based,
>> >> >> > while
>> >> >> > >>> >> INSERT
>> >> >> > >>> >> > > > > has to be putIfAbsent-based.
>> >> >> > >>> >> > > > >
>> >> >> > >>> >> > > > > 2016-07-20 12:30
GMT+03:00 Alexander Paschenko
>> >> >> > >>> >> > > > > <alexander.a.paschenko@gmail.com>:
>> >> >> > >>> >> > > > >> Hell Igniters,
>> >> >> > >>> >> > > > >>
>> >> >> > >>> >> > > > >> In this thread
I would like to share and discuss
>> some
>> >> >> > >>> thoughts on
>> >> >> > >>> >> > DML
>> >> >> > >>> >> > > > >> operations'
implementation, so let's start and
>> keep it
>> >> >> > here.
>> >> >> > >>> >> > Everyone
>> >> >> > >>> >> > > > >> is of course
welcome to share their suggestions.
>> >> >> > >>> >> > > > >>
>> >> >> > >>> >> > > > >> For starters,
I was thinking about semantics of
>> >> INSERT.
>> >> >> In
>> >> >> > >>> >> > traditional
>> >> >> > >>> >> > > > >> RDBMSs, INSERT
works only for records whose primary
>> >> keys
>> >> >> > don't
>> >> >> > >>> >> > > > >> conflict with
those of records that are already
>> >> >> persistent
>> >> >> > -
>> >> >> > >>> you
>> >> >> > >>> >> > can't
>> >> >> > >>> >> > > > >> try to insert
the same key more than once because
>> >> you'll
>> >> >> > get
>> >> >> > >>> an
>> >> >> > >>> >> > error.
>> >> >> > >>> >> > > > >> However, semantics
of cache put is obviously
>> >> different -
>> >> >> it
>> >> >> > >>> does
>> >> >> > >>> >> not
>> >> >> > >>> >> > > > >> have anything
about duplicate keys, it just quietly
>> >> >> updates
>> >> >> > >>> values
>> >> >> > >>> >> > in
>> >> >> > >>> >> > > > >> case of keys'
duplication. Still, cache has
>> >> putIfAbsent
>> >> >> > >>> operation
>> >> >> > >>> >> > that
>> >> >> > >>> >> > > > >> is closer to
traditional notion of INSERT, and H2's
>> >> SQL
>> >> >> > >>> dialect
>> >> >> > >>> >> has
>> >> >> > >>> >> > > > >> MERGE operation
which corresponds to semantics of
>> >> cache
>> >> >> > put.
>> >> >> > >>> >> Thus, I
>> >> >> > >>> >> > > > >> suggest that
we implement MERGE as a separate
>> >> operation
>> >> >> > >>> backed by
>> >> >> > >>> >> > > > >> putIfAbsent
operation, while INSERT will be
>> >> implemented
>> >> >> via
>> >> >> > >>> put.
>> >> >> > >>> >> > > > >>
>> >> >> > >>> >> > > > >> And one more,
probably more important thing: I
>> suggest
>> >> >> > that we
>> >> >> > >>> >> > create
>> >> >> > >>> >> > > > >> separate class
Update and corresponding operation
>> >> >> update()
>> >> >> > in
>> >> >> > >>> >> > > > >> IgniteCache.
The reasons are as follows:
>> >> >> > >>> >> > > > >>
>> >> >> > >>> >> > > > >> - Query bears
some flags that are clearly redundant
>> >> for
>> >> >> > Update
>> >> >> > >>> >> (page
>> >> >> > >>> >> > > > >> size, locality)
>> >> >> > >>> >> > > > >> - query() method
in IgniteCache (one that accepts
>> >> Query)
>> >> >> > and
>> >> >> > >>> >> query()
>> >> >> > >>> >> > > > >> methods in GridQueryIndexing
return iterators. So,
>> if
>> >> we
>> >> >> > >>> strive to
>> >> >> > >>> >> > > > >> leave interfaces
unchanged, we still will introduce
>> >> some
>> >> >> > >>> design
>> >> >> > >>> >> > > > >> ugliness like
query methods returning empty
>> iterators
>> >> for
>> >> >> > >>> certain
>> >> >> > >>> >> > > > >> queries, and/or
query flags that indicate whether
>> >> it's an
>> >> >> > >>> update
>> >> >> > >>> >> > query
>> >> >> > >>> >> > > > >> or not, etc.
>> >> >> > >>> >> > > > >> - If some Queries
are update queries, then
>> continuous
>> >> >> > queries
>> >> >> > >>> >> can't
>> >> >> > >>> >> > be
>> >> >> > >>> >> > > > >> based on them
- more design-wise ugly checks and
>> stuff
>> >> >> like
>> >> >> > >>> that.
>> >> >> > >>> >> > > > >> - I'm pretty
sure there's more I don't know about.
>> >> >> > >>> >> > > > >>
>> >> >> > >>> >> > > > >> Comments and
suggestions are welcome. Sergi
>> Vladykin,
>> >> >> > Dmitry
>> >> >> > >>> >> > > > >> Setrakyan, your
opinions are of particular
>> interest,
>> >> >> please
>> >> >> > >>> >> advise.
>> >> >> > >>> >> > > > >>
>> >> >> > >>> >> > > > >> Regards,
>> >> >> > >>> >> > > > >> Alex
>> >> >> > >>> >> > > >
>> >> >> > >>> >> > >
>> >> >> > >>> >> >
>> >> >> > >>> >>
>> >> >> > >>>
>> >> >> > >>
>> >> >> > >>
>> >> >> >
>> >> >>
>> >>
>>

Mime
View raw message