ignite-dev mailing list archives

From Николай Ижиков <nizhikov....@gmail.com>
Subject Re: Spark+Ignite SQL syntax proposal
Date Thu, 05 Oct 2017 17:57:21 GMT
Hello, Valentin.

I implemented the ability to run Spark SQL queries against both:

1.  An Ignite SQL table. Internally, the table is described by a QueryEntity with meta
information about the data.
2.  A key-value cache - a regular Ignite cache without meta information about the
stored data.

In the second case, we have to know which types the cache stores.
So for this case, I propose using the syntax I described.
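
To make the proposed naming convention concrete, here is a minimal sketch of how a table name of the form `cacheName[keyClass,valueClass]` (the syntax from the quoted message below) could be split into its parts. The names `KeyValueTable` and `parse` are hypothetical and are not part of Ignite or Spark; the actual catalog implementation may resolve names differently.

```scala
import scala.util.Try

// Hypothetical holder for the three parts of a key-value table name.
// Not an Ignite or Spark class; used here for illustration only.
final case class KeyValueTable(cacheName: String, keyClass: Class[_], valueClass: Class[_])

object KeyValueTable {
  // Matches names such as testCache[java.lang.Integer,java.lang.String].
  // Each part may contain anything except '[', ']' and ','.
  private val Pattern = """^([^\[\],]+)\[([^\[\],]+),([^\[\],]+)\]$""".r

  // Returns None when the name does not follow the convention
  // or when one of the classes cannot be loaded.
  def parse(tableName: String): Option[KeyValueTable] = tableName.trim match {
    case Pattern(cache, key, value) =>
      Try(KeyValueTable(cache.trim, Class.forName(key.trim), Class.forName(value.trim))).toOption
    case _ =>
      None
  }
}
```

With this sketch, a plain name such as `testCache` yields `None`, so a caller could fall back to the regular SQL table lookup in that case.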


2017-10-05 20:45 GMT+03:00 Valentin Kulichenko <
valentin.kulichenko@gmail.com>:

> Nikolay,
>
> I don't understand. Why are we required to provide key and value types in
> SQL? What is the issue you're trying to solve with this syntax?
>
> -Val
>
> On Thu, Oct 5, 2017 at 7:05 AM, Николай Ижиков <nizhikov.dev@gmail.com>
> wrote:
>
> > Hello, guys.
> >
> > I’m working on IGNITE-3084 [1] “Spark Data Frames Support in Apache
> > Ignite” and have a proposal to discuss.
> >
> > I want to provide a consistent way to query Ignite key-value caches from
> > the Spark SQL engine.
> >
> > To implement it, I have to determine the Java class for the key and the value.
> > This is required to calculate the schema for a Spark Data Frame.
> > As far as I know, there is currently no meta information for a key-value cache in
> > Ignite.
> >
> > If a regular data source is used, a user can provide the key class and value
> > class through options. Example:
> >
> > ```
> > val df = spark.read
> >   .format(IGNITE)
> >   .option("config", CONFIG)
> >   .option("cache", CACHE_NAME)
> >   .option("keyClass", "java.lang.Long")
> >   .option("valueClass", "java.lang.String")
> >   .load()
> >
> > df.printSchema()
> >
> > df.createOrReplaceTempView("testCache")
> >
> > val igniteDF = spark.sql("SELECT key, value FROM testCache WHERE key >= 2 AND value like '%0'")
> > ```
> >
> > But if we use the Ignite implementation of the Spark catalog, we don’t want to
> > register existing caches by hand.
> > Anton Vinogradov proposed a syntax that I personally like very much:
> >
> > *Let’s use the following table name for a key-value cache -
> > `cacheName[keyClass,valueClass]`*
> >
> > Example:
> >
> > ```
> > val df3 = igniteSession.sql("SELECT * FROM `testCache[java.lang.Integer,java.lang.String]` WHERE key % 2 = 0")
> >
> > df3.printSchema()
> >
> > df3.show()
> > ```
> >
> > Thoughts?
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-3084
> >
> > --
> > Nikolay Izhikov
> > NIzhikov.dev@gmail.com
> >
>



-- 
Nikolay Izhikov
NIzhikov.dev@gmail.com
