ignite-dev mailing list archives

From Николай Ижиков <nizhikov....@gmail.com>
Subject Re: Spark+Ignite SQL syntax proposal
Date Fri, 06 Oct 2017 03:49:51 GMT
Ok. Got it. Will remove key-value support from the catalog.

On Oct 6, 2017, at 6:34 AM, "Denis Magda" <dmagda@apache.org>
wrote:

> I tend to agree with Val that key-value support seems excessive. My
> suggestion is to consider Ignite as a SQL database for this specific
> integration, implementing only the relevant functionality.
>
> —
> Denis
>
> > On Oct 5, 2017, at 5:41 PM, Valentin Kulichenko <valentin.kulichenko@gmail.com> wrote:
> >
> > Nikolay,
> >
> > I don't think we need this, especially with this kind of syntax, which is
> > very confusing. The main use case for data frames is SQL, so let's
> > concentrate on it. We should use Ignite's SQL engine capabilities as much
> > as possible. If we see other use cases down the road, we can always
> > support them.
> >
> > -Val
> >
> > On Thu, Oct 5, 2017 at 10:57 AM, Николай Ижиков <nizhikov.dev@gmail.com>
> > wrote:
> >
> >> Hello, Valentin.
> >>
> >> I implemented the ability to run Spark SQL queries against both:
> >>
> >> 1.  Ignite SQL table. Internally, the table is described by a QueryEntity
> >>     with meta information about the data (see the sketch after this list).
> >> 2.  Key-value cache: a regular Ignite cache without meta information about
> >>     the stored data.
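> >>
> >> For the first case, a minimal sketch of how such a table can be declared on
> >> the Ignite side (an illustration only; the cache name and config path below
> >> are made up). The QueryEntity is what carries the meta information the
> >> integration can build a Data Frame schema from:
> >>
> >> ```
> >> import org.apache.ignite.Ignition
> >> import org.apache.ignite.cache.QueryEntity
> >> import org.apache.ignite.configuration.CacheConfiguration
> >>
> >> // SQL meta information is attached to the cache through a QueryEntity.
> >> val entity = new QueryEntity("java.lang.Long", "java.lang.String")
> >>
> >> val ccfg = new CacheConfiguration[java.lang.Long, java.lang.String]("sqlCache")
> >>   .setQueryEntities(java.util.Collections.singletonList(entity))
> >>
> >> // Assumption: path to a Spring XML configuration, as in the examples below.
> >> val CONFIG = "ignite-config.xml"
> >>
> >> val ignite = Ignition.start(CONFIG)
> >> val cache = ignite.getOrCreateCache(ccfg)
> >> ```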
> >>
> >> In the second case, we have to know which types the cache stores.
> >> For that case, I propose the syntax described below.
> >>
> >>
> >> 2017-10-05 20:45 GMT+03:00 Valentin Kulichenko <
> >> valentin.kulichenko@gmail.com>:
> >>
> >>> Nikolay,
> >>>
> >>> I don't understand. Why do we require the key and value types to be
> >>> provided in SQL? What is the issue you're trying to solve with this syntax?
> >>>
> >>> -Val
> >>>
> >>> On Thu, Oct 5, 2017 at 7:05 AM, Николай Ижиков <nizhikov.dev@gmail.com>
> >>> wrote:
> >>>
> >>>> Hello, guys.
> >>>>
> >>>> I’m working on IGNITE-3084 [1] “Spark Data Frames Support in Apache Ignite”
> >>>> and have a proposal to discuss.
> >>>>
> >>>> I want to provide a consistent way to query Ignite key-value caches
> >>>> from the Spark SQL engine.
> >>>>
> >>>> To implement it, I have to determine the Java classes of the key and
> >>>> the value. This is required to calculate the schema of a Spark Data Frame.
> >>>> As far as I know, there is currently no meta information for a key-value
> >>>> cache in Ignite.
> >>>>
> >>>> If a regular data source is used, a user can provide the key class and
> >>>> the value class through options. Example:
> >>>>
> >>>> ```
> >>>> val df = spark.read
> >>>>  .format(IGNITE)
> >>>>  .option("config", CONFIG)
> >>>>  .option("cache", CACHE_NAME)
> >>>>  .option("keyClass", "java.lang.Long")
> >>>>  .option("valueClass", "java.lang.String")
> >>>>  .load()
> >>>>
> >>>> df.printSchema()
> >>>>
> >>>> df.createOrReplaceTempView("testCache")
> >>>>
> >>>> val igniteDF = spark.sql(
> >>>>   "SELECT key, value FROM testCache WHERE key = 2 AND value like '%0'")
> >>>> ```
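> >>>>
> >>>> To show what the data source has to do with these two options, here is a
> >>>> rough sketch of the schema calculation (the type mapping and the
> >>>> `key`/`value` field names are my assumptions, not a final implementation):
> >>>>
> >>>> ```
> >>>> import org.apache.spark.sql.types._
> >>>>
> >>>> // Map a Java class name to a Spark SQL type (only a few primitives shown).
> >>>> def sparkType(clsName: String): DataType = clsName match {
> >>>>   case "java.lang.Long"    => LongType
> >>>>   case "java.lang.Integer" => IntegerType
> >>>>   case "java.lang.Double"  => DoubleType
> >>>>   case "java.lang.String"  => StringType
> >>>>   case other => throw new IllegalArgumentException(s"Unsupported type: $other")
> >>>> }
> >>>>
> >>>> // Two-column schema for a key-value cache: `key` and `value`.
> >>>> def kvSchema(keyClass: String, valueClass: String): StructType =
> >>>>   StructType(Seq(
> >>>>     StructField("key", sparkType(keyClass), nullable = false),
> >>>>     StructField("value", sparkType(valueClass), nullable = true)))
> >>>>
> >>>> // e.g. kvSchema("java.lang.Long", "java.lang.String") -> key: long, value: string
> >>>> ```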
> >>>>
> >>>> But if we use the Ignite implementation of the Spark catalog, we don’t
> >>>> want to register existing caches by hand.
> >>>> Anton Vinogradov proposed a syntax that I personally like very much:
> >>>>
> >>>> *Let’s use the following table name for a key-value cache:
> >>>> `cacheName[keyClass,valueClass]`*
> >>>>
> >>>> Example:
> >>>>
> >>>> ```
> >>>> val df3 = igniteSession.sql(
> >>>>   "SELECT * FROM `testCache[java.lang.Integer,java.lang.String]` WHERE key % 2 = 0")
> >>>>
> >>>> df3.printSchema()
> >>>>
> >>>> df3.show()
> >>>> ```
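> >>>>
> >>>> On the catalog side such a table name could be recognized with something
> >>>> as simple as this (a sketch only; it assumes class names contain no
> >>>> brackets or commas):
> >>>>
> >>>> ```
> >>>> // Split `cacheName[keyClass,valueClass]` into its three parts.
> >>>> val KeyValueTable = """([^\[\]]+)\[([\w.$]+),([\w.$]+)\]""".r
> >>>>
> >>>> "testCache[java.lang.Integer,java.lang.String]" match {
> >>>>   case KeyValueTable(cache, keyCls, valCls) =>
> >>>>     println(s"cache=$cache, key=$keyCls, value=$valCls")
> >>>>   case plainName =>
> >>>>     println(s"regular SQL table: $plainName")
> >>>> }
> >>>> ```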
> >>>>
> >>>> Thoughts?
> >>>>
> >>>> [1] https://issues.apache.org/jira/browse/IGNITE-3084
> >>>>
> >>>> --
> >>>> Nikolay Izhikov
> >>>> NIzhikov.dev@gmail.com
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >> Nikolay Izhikov
> >> NIzhikov.dev@gmail.com
> >>
>
>
