ignite-dev mailing list archives

From Nikolay Izhikov <nizhi...@apache.org>
Subject Re: [SparkDataFrame] Query Optimization. Prototype
Date Tue, 13 Feb 2018 16:07:54 GMT
Hello, Valentin.

> When you're talking about join optimization, what exactly are you referring to?

I'm referring to my PR [1].
Currently, it contains a transformation of Spark joins into Ignite joins [2].
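To illustrate the kind of query the optimization targets (this is not the PR code; the config path and table names below are invented), the idea is that a Spark-side join over two Ignite-backed data frames should collapse into a single Ignite SQL query:

import org.apache.spark.sql.SparkSession
import org.apache.ignite.spark.IgniteDataFrameSettings._

// Illustration only: the config path and table names are invented.
val spark = SparkSession.builder()
  .appName("ignite-join-pushdown-sketch")
  .master("local")
  .getOrCreate()

val CONFIG = "/path/to/ignite-config.xml"

def igniteTable(name: String) = spark.read
  .format(FORMAT_IGNITE)
  .option(OPTION_CONFIG_FILE, CONFIG)
  .option(OPTION_TABLE, name)
  .load()

val person = igniteTable("person")
val city   = igniteTable("city")

// Without the optimization Spark loads both tables and performs the join
// itself; with it the plan should become one Ignite SQL query with a JOIN.
val joined = person.join(city, person("city_id") === city("id"))

joined.explain(true)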

But if I understand Vladimir's answer right, we don't *fully* support SQL join queries for now.

Sometimes a join will work just fine; in other cases it will throw an exception due to Ignite's
internal implementation.

Please see my example [3].
The query on line 4 throws an exception.
The same query on line 10 succeeds because an index is created first.

Both of them are syntactically correct.
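For readers without the gist at hand, the two runs have roughly the following shape (this is not the gist itself; the schema, names, and whether the first query actually fails depend on the cluster and query settings):

import java.sql.DriverManager

// Illustration only, not the actual gist content: all names are invented.
val conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1/")
val stmt = conn.createStatement()

stmt.executeUpdate("CREATE TABLE city (id BIGINT PRIMARY KEY, name VARCHAR)")
stmt.executeUpdate("CREATE TABLE person (id BIGINT PRIMARY KEY, name VARCHAR, city_id BIGINT)")

val join = "SELECT p.name, c.name FROM person p JOIN city c ON c.id = p.city_id"

// With no index on the join key this kind of query can fail with an exception
// coming from the Ignite/H2 join machinery (depending on the setup)...
// stmt.executeQuery(join)

// ...while after an index on the join column the very same statement runs fine.
stmt.executeUpdate("CREATE INDEX person_city_idx ON person (city_id)")
val rs = stmt.executeQuery(join)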

> Unfortunately, at this moment we do not have a complete list of all restrictions on our
> joins, because a lot of work is delegated to H2.
> In some unsupported scenarios we throw an exception.
> In other cases we return incorrect results silently (e.g. if you do not co-locate data
> and forget to set the "distributed joins" flag).
> We have a plan to perform extensive testing of joins (both co-located and distributed)
> and list all known limitations.
> This would require writing a lot of unit tests to cover various scenarios.
> I think we will have this information in a matter of 1-2 months.

[1] https://github.com/apache/ignite/pull/3397
[2] https://github.com/apache/ignite/pull/3397/files#diff-5a861613530bbce650efa50d553a0e92R227
[3] https://gist.github.com/nizhikov/a4389fd78636869dd38c13920b5baf2b
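As a side note on the "distributed joins" flag Vladimir mentions above, this is roughly how it is set on a query (a minimal sketch; the config path, cache name and tables are assumed, not taken from this thread):

import org.apache.ignite.Ignition
import org.apache.ignite.cache.query.SqlFieldsQuery

// Sketch only: config path, cache name and table names are assumed.
val ignite = Ignition.start("/path/to/ignite-config.xml")
val cache = ignite.cache[Any, Any]("SQL_PUBLIC_PERSON")

val qry = new SqlFieldsQuery(
  "SELECT p.name, c.name FROM person p JOIN city c ON c.id = p.city_id")

// If person and city are not co-located and this flag stays at its default
// (false), the join can silently return incomplete results - the trap
// described in the quoted text above.
qry.setDistributedJoins(true)

val rows = cache.query(qry).getAll()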

On Mon, 12/02/2018 at 13:45 -0800, Valentin Kulichenko wrote:
> Nikolay,
> 
> When you're talking about join optimization, what exactly are you referring to?
> 
> Since other parts of the data frames integration are already merged, I think it's a good
> time to resurrect this thread. Does it make sense to review it right now? Or do you want
> to make some more changes?
> 
> -Val
> 
> On Mon, Feb 12, 2018 at 12:20 AM, Vladimir Ozerov <vozerov@gridgain.com> wrote:
> > Hi Nikolay,
> > 
> > I am not sure if a ticket for DECIMAL column metadata exists. If you haven't found
> > one under the "sql" component, please feel free to create it on your own. As for testing
> > of joins, I think it makes sense to start working on it when we finish the ANSI compliance
> > testing which is already in progress.
> > 
> > On Wed, Jan 24, 2018 at 12:27 PM, Nikolay Izhikov <nizhikov.dev@gmail.com> wrote:
> > > Hello, Vladimir.
> > > 
> > > Thank you for the answer.
> > > 
> > > > Do you mean whether it is possible to read it from table metadata?
> > > 
> > > Yes, you are right.
> > > I want to read the scale and precision of a DECIMAL column from the table metadata.
> > > 
> > > > This will be fixed at some point in the future, but I do not have any dates at the moment.
> > > 
> > > Is there a ticket for it? I can't find it via Jira search.
> > > 
> > > > at this moment we do not have a complete list of all restrictions on our
> > > > joins, because a lot of work is delegated to H2.
> > > > In some unsupported scenarios we throw an exception.
> > > > In other cases we return incorrect results silently (e.g. if you do not
> > > > co-locate data and forget to set the "distributed joins" flag).
> > > 
> > > Guys, Val, maybe we should exclude join optimization from IGNITE-7077 while
> > > we don't have all the limitations at hand?
> > > 
> > > > We have a plan to perform extensive testing of joins (both co-located
> > > > and distributed) and list all known limitations.
> > > 
> > > Can I help somehow with this activity?
> > > 
> > > 
> > > On Wed, 24/01/2018 at 12:08 +0300, Vladimir Ozerov wrote:
> > > > Hi Nikolay,
> > > >
> > > > Could you please clarify your question about scale and precision? Do you
> > > > mean whether it is possible to read it from table metadata? If yes, it is not possible at
> > > > the moment, unfortunately - we do not store information about lengths, scales and precision;
> > > > only the actual data types are passed to H2 (e.g. String, BigDecimal, etc.). This will be
> > > > fixed at some point in the future, but I do not have any dates at the moment.
> > > >
> > > > Now about joins - Denis, I think you provided a wrong link to our internal
> > > > GridGain docs where we accumulate information about ANSI compatibility and which we are
> > > > going to publish on the Ignite wiki when it is ready. In any case, this is not what Nikolay
> > > > asked about. The question was about limitations of our joins, which have nothing to do with
> > > > the ANSI standard. Unfortunately, at this moment we do not have a complete list of all
> > > > restrictions on our joins, because a lot of work is delegated to H2. In some unsupported
> > > > scenarios we throw an exception. In other cases we return incorrect results silently (e.g.
> > > > if you do not co-locate data and forget to set the "distributed joins" flag). We have a plan
> > > > to perform extensive testing of joins (both co-located and distributed) and list all known
> > > > limitations. This would require writing a lot of unit tests to cover various scenarios.
> > > > I think we will have this information in a matter of 1-2 months.
> > > >
> > > > Vladimir.
> > > >
> > > > On Tue, Jan 23, 2018 at 11:45 PM, Denis Magda <dmagda@apache.org> wrote:
> > > > > Agree. The unsupported functions should be mentioned on the page
> > > > > that will cover Ignite ANSI-99 compliance. We have the first results available for the
> > > > > CORE features of the specification:
> > > > > https://ggsystems.atlassian.net/wiki/spaces/GG/pages/45093646/ANSI+SQL+99
> > > > >
> > > > > That’s on my radar. I’ll take care of this.
> > > > >
> > > > > —
> > > > > Denis
> > > > >
> > > > > > On Jan 23, 2018, at 10:31 AM, Dmitriy Setrakyan <dsetrakyan@apache.org> wrote:
> > > > > >
> > > > > > I think we need a page listing the unsupported functions with an explanation
> > > > > > of why: either a function does not make sense in Ignite, or it is planned for a
> > > > > > future release.
> > > > > >
> > > > > > Sergey, do you think you will be able to do it?
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Tue, Jan 23, 2018 at 12:05 AM, Serge Puchnin <sergey.puchnin@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > >> Yes, the CAST function is supported by both Ignite and H2.
> > > > > >>
> > > > > >> I've updated the documentation for the following system functions:
> > > > > >> CASEWHEN Function, CAST, CONVERT, TABLE
> > > > > >>
> > > > > >> https://apacheignite-sql.readme.io/docs/system-functions
> > > > > >>
> > > > > >> And to my mind, the following functions aren't applicable to Ignite:
> > > > > >> ARRAY_GET, ARRAY_LENGTH, ARRAY_CONTAINS, CSVREAD, CSVWRITE, DATABASE,
> > > > > >> DATABASE_PATH, DISK_SPACE_USED, FILE_READ, FILE_WRITE, LINK_SCHEMA,
> > > > > >> MEMORY_FREE, MEMORY_USED, LOCK_MODE, LOCK_TIMEOUT, READONLY, CURRVAL,
> > > > > >> AUTOCOMMIT, CANCEL_SESSION, IDENTITY, NEXTVAL, ROWNUM, SCHEMA,
> > > > > >> SCOPE_IDENTITY, SESSION_ID, SET, TRANSACTION_ID, TRUNCATE_VALUE, USER,
> > > > > >> H2VERSION
> > > > > >>
> > > > > >> Also, an issue was created to review the current documentation:
> > > > > >> https://issues.apache.org/jira/browse/IGNITE-7496
> > > > > >>
> > > > > >> --
> > > > > >> BR,
> > > > > >> Serge
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >> --
> > > > > >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> > > > > >>
> > > > >
> > > >
> > > >
> 
> 