From: Denis Magda
Date: Mon, 18 Nov 2019 08:41:59 -0800
Subject: Re: The Spark 2.4 support
To: dev
Cc: Nikolay Izhikov
Reply-To: dev@ignite.apache.org

Alexey,

Please help me understand what it means that the 2.4 integration supports "95% of tests of 2.3". Does it mean that 5% of the existing tests are failing and, basically, need to be fixed?

-
Denis

On Mon, Nov 18, 2019 at 6:52 AM Alexey Zinoviev wrote:

> Dear Nikolay Izhikov, I've recreated the PR for initial 2.4 support.
>
> The last commit,
> https://github.com/apache/ignite/pull/7058/commits/60386802299deedc6ed60bf4736e922201a67fb8
> contains the real changes from Spark 2.3.
>
> I suggest merging this initial solution, with 95% support of Spark 2.4, to master and continuing work on the known issues listed in JIRA.
>
> This solution supports the new Spark version for all examples and 95% of the tests of 2.3.
>
> On Tue, Oct 1, 2019 at 08:48, Ivan Pavlukhin wrote:
>
> > Alexey, Nikolay,
> >
> > Thank you for sharing details!
> >
> > On Tue, Oct 1, 2019 at 07:42, Alexey Zinoviev wrote:
> > >
> > > Great talk and paper, I studied them last year.
> > >
> > > On Mon, Sep 30, 2019 at 21:42, Nikolay Izhikov wrote:
> > >
> > > > Yes, I can :)
> > > >
> > > > On Mon, 30/09/2019 at 11:40 -0700, Denis Magda wrote:
> > > > > Nikolay,
> > > > >
> > > > > Would you be able to review the changes? I'm not sure there is a better candidate for now.
> > > > >
> > > > > -
> > > > > Denis
> > > > >
> > > > > On Mon, Sep 30, 2019 at 11:01 AM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > Hello, Ivan.
> > > > > >
> > > > > > I gave a talk about the internals of the Spark integration in Ignite.
> > > > > > It answers the question of why we should use Spark internals.
> > > > > >
> > > > > > You can take a look at my meetup talk (in Russian) [1] or read an article if you prefer text [2].
> > > > > >
> > > > > > [1] https://www.youtube.com/watch?v=CzbAweNKEVY
> > > > > > [2] https://habr.com/ru/company/sberbank/blog/427297/
> > > > > >
> > > > > > On Mon, 30/09/2019 at 20:29 +0300, Alexey Zinoviev wrote:
> > > > > > > Yes, as I understand it, it has used Spark internals from the first commit)))
> > > > > > > The reason: we take the Spark SQL query execution plan and try to execute it on the Ignite cluster.
> > > > > > > We also inherit from a lot of Developer API related classes that could be unstable. Spark has no good extension point, and this is the reason why we have to go deeper.
> > > > > > >
> > > > > > > On Mon, Sep 30, 2019 at 20:17, Ivan Pavlukhin <vololo100@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Alexey,
> > > > > > > >
> > > > > > > > As an external watcher very far from the Ignite Spark integration, I would like to ask a humble question for my understanding. Why does this integration use Spark internals? Is it a common approach for integrating with Spark?
> > > > > > > >
> > > > > > > > On Mon, Sep 30, 2019 at 16:17, Alexey Zinoviev <zaleslaw.sin@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi, Igniters
> > > > > > > > > I've started work on the Spark 2.4 support.
> > > > > > > > >
> > > > > > > > > We started the discussion here:
> > > > > > > > > https://issues.apache.org/jira/browse/IGNITE-12054
> > > > > > > > >
> > > > > > > > > The Spark internals were totally refactored between versions 2.3 and 2.4; the main changes touch:
> > > > > > > > >
> > > > > > > > > - External catalog and listeners refactoring
> > > > > > > > > - Changes in HAVING operator semantics support
> > > > > > > > > - Push-down NULL filter generation in JOIN plans
> > > > > > > > > - Minor changes in plan generation that should be adapted in our integration module
> > > > > > > > >
> > > > > > > > > I propose an initial solution: creating a new module, spark-2.4 (https://issues.apache.org/jira/browse/IGNITE-12247), and adding a new profile, spark-2.4 (to avoid possible clashes with other Spark versions).
> > > > > > > > >
> > > > > > > > > I've also turned the ticket into an umbrella ticket and created a few tickets for muted tests (around 7 of 211 tests are muted now).
> > > > > > > > >
> > > > > > > > > If anybody is interested, please make an initial review of the modular Ignite structure and the changes (without diving deep into the Spark code).
> > > > > > > > >
> > > > > > > > > And yes, the proposed code is a copy-paste of the spark-ignite module with a few fixes.
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best regards,
> > > > > > > > Ivan Pavlukhin
> >
> > --
> > Best regards,
> > Ivan Pavlukhin