ignite-dev mailing list archives

From Alexander Paschenko <alexander.a.pasche...@gmail.com>
Subject Re: DML data streaming
Date Thu, 09 Feb 2017 09:53:54 GMT

Streaming does not make sense for INSERT FROM SELECT, as this pattern does
not match the primary use case for streaming (bulk data load into Ignite).


No. I suggest that data streamer mode optionally support the full semantics
of INSERT (throw an exception if there's a duplicate PK), depending on a flag
that is yet to be introduced. Currently new records are quietly ignored on
key duplication, so it's really just a question of notifying the user about
duplicate keys in streaming mode.
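
The two INSERT behaviors discussed here can be sketched in plain Java. This is
only an illustration under stated assumptions: an in-memory map stands in for
the cache, nothing below is the Ignite data streamer API, and the class name
StreamingInsertSketch and its strict flag are hypothetical names made up for
the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a cache loaded in streaming mode; NOT the Ignite API.
final class StreamingInsertSketch {
    private final Map<Integer, String> cache = new HashMap<>();

    // Hypothetical flag: true = full INSERT semantics, false = current silent skip.
    private final boolean strict;

    StreamingInsertSketch(boolean strict) {
        this.strict = strict;
    }

    /**
     * Stores the row if the key is new and returns true.
     * On a duplicate key: throws in strict mode, otherwise keeps the
     * existing value and returns false without notifying anyone.
     */
    boolean insert(int key, String value) {
        String existing = cache.putIfAbsent(key, value);
        if (existing == null) {
            return true; // key was absent, row stored
        }
        if (strict) {
            throw new IllegalStateException("Duplicate key: " + key);
        }
        return false; // duplicate quietly ignored
    }

    String get(int key) {
        return cache.get(key);
    }

    public static void main(String[] args) {
        StreamingInsertSketch lenient = new StreamingInsertSketch(false);
        lenient.insert(1, "first");
        // Second insert with the same key is dropped; the first value survives.
        System.out.println("stored? " + lenient.insert(1, "second"));
        System.out.println("value:  " + lenient.get(1));

        StreamingInsertSketch strictMode = new StreamingInsertSketch(true);
        strictMode.insert(1, "first");
        try {
            strictMode.insert(1, "second");
        } catch (IllegalStateException e) {
            System.out.println("strict mode reported: " + e.getMessage());
        }
    }
}
```

In the lenient case the caller can only learn about the duplicate by checking
the return value, which a streaming load normally would not do per row; the
strict case trades that speed for an explicit error.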

Update by primary key is implemented now, but obviously it involves the user
messing with the _key column, which we're planning to hide from them in the
near future.

Streaming is turned on via the flag, just as we agreed in one of the previous
threads. This thread is not about how we turn streaming on, but rather about
the semantic correctness of INSERT and MERGE in this mode, and about whether
we need UPDATE and DELETE in it, since they do not essentially load new data
into the cache and (_in streaming mode_) make the user mess with the service
columns _key and _val.
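
For the MERGE half of that question, the concern raised in the quoted message
below (the whole value being replaced when the key already exists) can be
sketched the same way. Again this is plain Java over an in-memory map, not
Ignite code; the class StreamingMergeSketch and both method names are
hypothetical, and a row is modeled as a simple column-name-to-value map.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of two MERGE strategies; NOT the Ignite API.
final class StreamingMergeSketch {

    // Whole-value replacement: the existing row is discarded, so columns
    // that the MERGE statement did not mention are lost.
    static void mergeReplacingValue(Map<Integer, Map<String, Object>> cache,
                                    int key, Map<String, Object> row) {
        cache.put(key, new HashMap<>(row));
    }

    // Per-column merge: only the columns present in the new row are
    // overwritten; other columns of an existing row are kept.
    static void mergePerColumn(Map<Integer, Map<String, Object>> cache,
                               int key, Map<String, Object> row) {
        cache.computeIfAbsent(key, k -> new HashMap<>()).putAll(row);
    }

    // Builds a one-row cache: key 1 -> {name=Ann, age=30}.
    static Map<Integer, Map<String, Object>> sampleCache() {
        Map<String, Object> person = new HashMap<>();
        person.put("name", "Ann");
        person.put("age", 30);
        Map<Integer, Map<String, Object>> cache = new HashMap<>();
        cache.put(1, person);
        return cache;
    }

    public static void main(String[] args) {
        // A MERGE that mentions only the "age" column.
        Map<String, Object> update = new HashMap<>();
        update.put("age", 31);

        Map<Integer, Map<String, Object>> replaced = sampleCache();
        mergeReplacingValue(replaced, 1, update);
        System.out.println("whole-value replace: " + replaced.get(1));

        Map<Integer, Map<String, Object>> merged = sampleCache();
        mergePerColumn(merged, 1, update);
        System.out.println("per-column merge:    " + merged.get(1));
    }
}
```

The per-column variant has to read the existing value before writing, which
is the kind of extra work the quoted message expects to hurt streaming
performance.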

- Alex
On Feb 8, 2017, 11:33 PM, "Dmitriy Setrakyan" <
dsetrakyan@apache.org> wrote:

> Alexander,
> Are you suggesting that currently to execute a simple INSERT for 1 row we
> invoke a data streamer on Ignite API? How about an update by a primary key?
> Why not execute a simple cache put in either case?
> I think we had a separate thread where we agreed that the streamer should
> only be turned on if a certain flag on a JDBC connection is set, no?
> D.
> On Wed, Feb 8, 2017 at 7:00 AM, Alexander Paschenko <
> alexander.a.paschenko@gmail.com> wrote:
> > Hello Igniters,
> >
> > I'd like to raise a few questions regarding data streaming via DML
> > statements.
> >
> > Currently, all types of DML statements are supported (INSERT, UPDATE,
> >
> > UPDATE and DELETE are supported in streaming mode only when their
> > WHERE condition is restricted to the _key and/or _val columns, and
> > UPDATE works only on the _val column directly.
> >
> > Given the ongoing work toward hiding _key and _val from the user as
> > far as possible, these features seem pointless and should not be
> > released. What do you think?
> >
> > Also, INSERT in streaming mode currently does not throw errors on
> > duplicate keys and silently ignores such new records (since this is
> > faster than it would be if we introduced a receiver that throws
> > exceptions). This can be fixed with an additional flag that could
> > _optionally_ make INSERT slower but semantically more accurate.
> >
> > MERGE in streaming mode is currently not totally accurate
> > semantically either: on key presence, it just replaces the whole value
> > with the new one, thus potentially losing the values of some concrete
> > columns/fields. This is analogous to
> > https://issues.apache.org/jira/browse/IGNITE-4489, but can hardly be
> > fixed, since it would probably hurt performance and would be
> > unreasonably complex to implement.
> >
> > I suggest that we drop everything except INSERT and introduce an
> > optional flag for its fully correct semantic behavior, as described above.
> >
> > - Alex
> >
