Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id E8381200B4F for ; Tue, 26 Jul 2016 16:56:56 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id E6B50160A75; Tue, 26 Jul 2016 14:56:56 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id DDAD6160A69 for ; Tue, 26 Jul 2016 16:56:55 +0200 (CEST) Received: (qmail 15546 invoked by uid 500); 26 Jul 2016 14:56:55 -0000 Mailing-List: contact dev-help@ignite.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@ignite.apache.org Delivered-To: mailing list dev@ignite.apache.org Received: (qmail 15534 invoked by uid 99); 26 Jul 2016 14:56:54 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 26 Jul 2016 14:56:54 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 4DB0DCB861 for ; Tue, 26 Jul 2016 14:56:54 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 1.199 X-Spam-Level: * X-Spam-Status: No, score=1.199 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id RBUh5Mtlakvj for ; Tue, 26 Jul 2016 14:56:50 +0000 (UTC) Received: from mail-ua0-f169.google.com (mail-ua0-f169.google.com [209.85.217.169]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 3308B5FE66 for ; Tue, 26 Jul 2016 14:56:50 +0000 (UTC) Received: by mail-ua0-f169.google.com with SMTP id j59so2218760uaj.3 for ; Tue, 26 Jul 2016 07:56:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=oUu4tq+xU/Jf1DQNWiAjqAn5fOtF9FB/LWpZiKjSbXU=; b=AGMJB4pil9i+dYbav+Th8rj0F4w/Furg2GLxFJLHf6e13dyoBlH0qpG24mPzC0nirw R25+dFUEX7FedJNpznSBkpYfgCtZTppebeODipUH+hGew4DCzmS85z7NDmfP2eSha3gQ vdAQ85A0Ihh+ZbAG+WJVwDi37xHLqIfg1l7k6yhwO9bkJlz9UkLJYjsU4FCIw9lNLteD PDjBfQCGdehFmscBgrtB0wb1ewYIMP9xufmYQENEgFtvC9cjxFd9G+e9G+RMQV3UUM3p khWxExMfslAUHbA2enuP9mHSFO3/gHyM3OsxN5gYDlPLXuwJ04K2pcOFoTwZXk84dloe uDMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=oUu4tq+xU/Jf1DQNWiAjqAn5fOtF9FB/LWpZiKjSbXU=; b=DvJiLjtdru8xtJsvrzLgyLaXbs2KEumN63KeAFz7TvaUfBMg+CC1Odt9edW/d9hwEO 0niYHUH54ASUa5nmSa9rvYKk3UVdCz+q0lUj6ImONBE3UbGmdWN2wpSq8DTC141QVa6K 27NBMJpCltXKTTeiMx8M6wtSZV4jEWQOmkBbJgkye/Pe9JWxnii7TRuXsz+mYoNz34m0 o1W0xCg7QHEt8wlbX91O867OYjOddx3AY/ABvimCDOjI/LJPlYbbibndjz1dJHfixDQk 41XLzbaIAKo0HljtCKWmC4lyiDa0YhD3ukaapTGqwH+VfEkV/ZSAOUpfOst8/lmTO5S/ j6vA== X-Gm-Message-State: AEkoout4MnyUuDkt5QtGwBZFUhXdZuirj7O5ZK+JLgO/hPdBL53KLU9Nm4h47NSiALDafK/QkV4Duk2a554lQg== X-Received: by 10.176.64.129 with SMTP id i1mr8820535uad.156.1469545008902; Tue, 26 Jul 2016 07:56:48 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.49.2 with HTTP; Tue, 26 Jul 2016 07:56:09 -0700 (PDT) In-Reply-To: References: From: Sergi Vladykin Date: Tue, 26 Jul 2016 17:56:09 +0300 Message-ID: Subject: Re: IGNITE-2294 implementation details To: dev@ignite.apache.org Content-Type: multipart/alternative; boundary=94eb2c1243d63cbfd405388b1cd5 archived-at: Tue, 26 Jul 2016 14:56:57 -0000 --94eb2c1243d63cbfd405388b1cd5 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable I don't like any pre-parsing, especially with some libraries other than H2. H2 itself has enough quirks to multiply it on quirks of another library. This is exactly what I was talking about - we need some single entry point on API for all the SQL commands and queries. Thats why I suggested SqlUpdate to extend Query. To me its is the cleanest approach. May be we need to change in some backward compatible way this Query hierarchy to get rid of extra methods but the idea is still the same. Sergi 2016-07-26 14:34 GMT+03:00 Alexander Paschenko < alexander.a.paschenko@gmail.com>: > Guys, > > I would like to advance the discussion further. There's one quite > important question that arose based on current state of work on this > issue. If we use some kind of interactive console, like Visor, then > how should it know whether SQL query it is requested to execute > returns a result set or not? In JDBC world, solution is quite simple - > there's base interface called Statement that all commands implement, > and it has magic isResultSet method that tells whether statement is a > query or an update command. The API proposed now has separate Query > and Update operations which I believe to be a right thing by the > reasons I outlined in the beginning of this thread. However, their > lack of common ancestor prevents possible console clients from running > text SQL commands in a fully transparent manner - like > IgniteCache.execute(String sql). Therefore I see two possible ways of > solving this: > > - we change API so that it includes new class or interface parenting > both Query and Update, and clients use it to communicate with cache > - we let (or make :) ) the client determine command type independently > and behave accordingly - for it to work it will have some kind of > command parsing by itself just to determine its type. Visor console > may use simple library like JSqlParser > (https://github.com/JSQLParser/JSqlParser; dual LGPL 2.1/ASF 2.0 > licensed) to determine request type in terms of JDBC, and behave > accordingly. > > Personally, I think that the second approach is better - and here's why. > > First, it does not seem wise to change API simply to make console (or > any other) clients simpler. Programmatic APIs should be concise and > short for programmatic use, console clients should be easy to use from > console - and that's it: after all, console client exists to free a > user from burden of doing things programmatically, so its aim is to > adapt API to console or whatever UI. > Second, possible complications in client implied by such approach > certainly won't be dramatic - I don't think that additional single > query parsing operation in client code will make it much harder to > develop. > Third, as I see it now, adding a new "synthetic" entity and new method > would take more effort to adapting the client to new API. > > Dmitry, Sergi, I would like to hear what you think about it all. Thanks. > > - Alex > > 2016-07-21 21:17 GMT+03:00 Dmitriy Setrakyan : > > OK, then using your analogy, the current behavior in Ignite is MERGE fo= r > > the most part. > > > > My preference is that Ignite SQL should work no different from > traditional > > databases, which means: > > > > - INSERT is translated into *putIfAbsent()* call in Ignite > > - UPDATE is translated into *replace()* call in Ignite > > - MERGE is translated into *put()* call in Ignite > > - For SQL BATCH calls we should delegate to Ignite batch operations, e.= g. > > *putAll()* > > > > The above should hold true for atomic and transactional put/putAll call= s, > > as well as for the data streamer. > > > > Does this make sense? > > > > D. > > > > On Thu, Jul 21, 2016 at 4:06 AM, Sergi Vladykin < > sergi.vladykin@gmail.com> > > wrote: > > > >> No, this does not make sense. > >> > >> There is no upsert mode in databases. There are operations: INSERT, > UPDATE, > >> DELETE, MERGE. > >> > >> I want to have clear understanding of how they have to behave in SQL > >> databases and how they will actually behave in Ignite in different > >> scenarios. Also I want to have clear understanding of performance > >> implications of each decision here. > >> > >> Anything wrong with that? > >> > >> Sergi > >> > >> On Thu, Jul 21, 2016 at 1:04 PM, Dmitriy Setrakyan < > dsetrakyan@apache.org> > >> wrote: > >> > >> > Serj, are you asking what will happen as of today? Then the answer t= o > all > >> > your questions is that duplicate keys are not an issue, and Ignite > always > >> > operates in **upsert** mode (which is essentially a *=E2=80=9Cput(= =E2=80=A6)=E2=80=9D > *method). > >> > > >> > However, the *=E2=80=9Cinsert=E2=80=9D* that is suggested by Alex wo= uld delegate to > >> > *=E2=80=9CputIfAbsent(=E2=80=A6)=E2=80=9D*, which in database world = makes more sense. > However, in > >> > this case, the *=E2=80=9Cupdate=E2=80=9D* syntax should delegate to = *=E2=80=9Creplace(=E2=80=A6)=E2=80=9D*, as > >> > update should fail in case if a key is absent. > >> > > >> > Considering the above, a notion of =E2=80=9C*upsert=E2=80=9D* or =E2= =80=9C*merge=E2=80=9D *operation > is > >> > very much needed, as it will give a user an option to perform > >> > =E2=80=9Cinsert-or-update=E2=80=9D in 1 call. > >> > > >> > Does this make sense? > >> > > >> > D. > >> > > >> > On Wed, Jul 20, 2016 at 9:39 PM, Sergi Vladykin < > >> sergi.vladykin@gmail.com> > >> > wrote: > >> > > >> > > I'd prefer to do MERGE operation last because in H2 it is not > standard > >> > ANSI > >> > > SQL MERGE. Or may be not implement it at all, or may be contribute > ANSI > >> > > correct version to H2, then implement it on Ignite. Need to > investigate > >> > the > >> > > semantics deeper before making any decisions here. > >> > > > >> > > Lets start with simple scenarios for INSERT and go through all the > >> > possible > >> > > cases and answer the questions: > >> > > - What will happen on key conflict in TX cache? > >> > > - What will happen on key conflict in Atomic cache? > >> > > - What will happen with the previous two if we use DataLoader? > >> > > - How to make these operations efficient (it will be simple enough > to > >> > > implement them with separate put/putIfAbsent operations but > probably we > >> > > will need some batching like putAllIfAbsent for efficiency)? > >> > > > >> > > As for API, we still will need to have a single entry point for al= l > SQL > >> > > queries/commands to allow any console work with it transparently. = It > >> > would > >> > > be great if we will be able to come up with something consistent > with > >> > this > >> > > idea on public API. > >> > > > >> > > Sergi > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > On Wed, Jul 20, 2016 at 2:23 PM, Dmitriy Setrakyan < > >> > > dsetrakyan@gridgain.com> > >> > > wrote: > >> > > > >> > > > Like the idea of merge and insert. I need more time to think abo= ut > >> the > >> > > API > >> > > > changes. > >> > > > > >> > > > Sergi, what do you think? > >> > > > > >> > > > Dmitriy > >> > > > > >> > > > > >> > > > > >> > > > On Jul 20, 2016, at 12:36 PM, Alexander Paschenko < > >> > > > alexander.a.paschenko@gmail.com> wrote: > >> > > > > >> > > > >> Thus, I suggest that we implement MERGE as a separate operati= on > >> > backed > >> > > > by putIfAbsent operation, while INSERT will be implemented via > put. > >> > > > > > >> > > > > Sorry, of course I meant that MERGE has to be put-based, while > >> INSERT > >> > > > > has to be putIfAbsent-based. > >> > > > > > >> > > > > 2016-07-20 12:30 GMT+03:00 Alexander Paschenko > >> > > > > : > >> > > > >> Hell Igniters, > >> > > > >> > >> > > > >> In this thread I would like to share and discuss some thought= s > on > >> > DML > >> > > > >> operations' implementation, so let's start and keep it here. > >> > Everyone > >> > > > >> is of course welcome to share their suggestions. > >> > > > >> > >> > > > >> For starters, I was thinking about semantics of INSERT. In > >> > traditional > >> > > > >> RDBMSs, INSERT works only for records whose primary keys don'= t > >> > > > >> conflict with those of records that are already persistent - > you > >> > can't > >> > > > >> try to insert the same key more than once because you'll get = an > >> > error. > >> > > > >> However, semantics of cache put is obviously different - it > does > >> not > >> > > > >> have anything about duplicate keys, it just quietly updates > values > >> > in > >> > > > >> case of keys' duplication. Still, cache has putIfAbsent > operation > >> > that > >> > > > >> is closer to traditional notion of INSERT, and H2's SQL diale= ct > >> has > >> > > > >> MERGE operation which corresponds to semantics of cache put. > >> Thus, I > >> > > > >> suggest that we implement MERGE as a separate operation backe= d > by > >> > > > >> putIfAbsent operation, while INSERT will be implemented via > put. > >> > > > >> > >> > > > >> And one more, probably more important thing: I suggest that w= e > >> > create > >> > > > >> separate class Update and corresponding operation update() in > >> > > > >> IgniteCache. The reasons are as follows: > >> > > > >> > >> > > > >> - Query bears some flags that are clearly redundant for Updat= e > >> (page > >> > > > >> size, locality) > >> > > > >> - query() method in IgniteCache (one that accepts Query) and > >> query() > >> > > > >> methods in GridQueryIndexing return iterators. So, if we > strive to > >> > > > >> leave interfaces unchanged, we still will introduce some desi= gn > >> > > > >> ugliness like query methods returning empty iterators for > certain > >> > > > >> queries, and/or query flags that indicate whether it's an > update > >> > query > >> > > > >> or not, etc. > >> > > > >> - If some Queries are update queries, then continuous queries > >> can't > >> > be > >> > > > >> based on them - more design-wise ugly checks and stuff like > that. > >> > > > >> - I'm pretty sure there's more I don't know about. > >> > > > >> > >> > > > >> Comments and suggestions are welcome. Sergi Vladykin, Dmitry > >> > > > >> Setrakyan, your opinions are of particular interest, please > >> advise. > >> > > > >> > >> > > > >> Regards, > >> > > > >> Alex > >> > > > > >> > > > >> > > >> > --94eb2c1243d63cbfd405388b1cd5--