ignite-dev mailing list archives

From Николай Ижиков <nizhikov....@gmail.com>
Subject Spark+Ignite SQL syntax proposal
Date Thu, 05 Oct 2017 14:05:15 GMT
Hello, guys.

I’m working on IGNITE-3084 [1] “Spark Data Frames Support in Apache Ignite”
and have a proposal to discuss.

I want to provide a consistent way to query Ignite key-value caches from
Spark SQL engine.

To implement it, I have to determine the Java classes for the key and the value.
They are required to calculate the schema for a Spark Data Frame.
As far as I know, Ignite currently keeps no such meta information for
key-value caches.
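To illustrate why the classes matter, here is a minimal sketch of deriving the two-column (key, value) schema from the configured class names. The object name, the supported-type list, and the DDL-string approach are my own assumptions for the example, not the actual patch:

```scala
// Hypothetical sketch: map the configured key/value classes to Spark SQL
// type names, producing the two-column (key, value) schema as a DDL string.
// The type list here is illustrative, not exhaustive.
object SchemaSketch {
  private val typeMap: Map[String, String] = Map(
    "java.lang.Long"    -> "bigint",
    "java.lang.Integer" -> "int",
    "java.lang.String"  -> "string",
    "java.lang.Double"  -> "double"
  )

  /** DDL-style schema string, e.g. "key bigint, value string". */
  def schemaFor(keyClass: String, valueClass: String): String = {
    def sqlType(cls: String): String =
      typeMap.getOrElse(cls, sys.error(s"Unsupported class: $cls"))
    s"key ${sqlType(keyClass)}, value ${sqlType(valueClass)}"
  }
}
```

Such a string could then be fed to Spark's schema parser; the point is only that without the two class names there is nothing to derive the schema from.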

If a regular data source is used, a user can provide the key class and value
class through options. Example:

```
val df = spark.read
  .format(IGNITE)
  .option("config", CONFIG)
  .option("cache", CACHE_NAME)
  .option("keyClass", "java.lang.Long")
  .option("valueClass", "java.lang.String")
  .load()

df.printSchema()

df.createOrReplaceTempView("testCache")

val igniteDF = spark.sql(
  "SELECT key, value FROM testCache WHERE key >= 2 AND value like '%0'")
```

But if we use an Ignite implementation of the Spark catalog, we don't want to
register existing caches by hand.
Anton Vinogradov proposed a syntax that I personally like very much:

*Let's use the following table name for a key-value cache:
`cacheName[keyClass,valueClass]`*

Example:

```
val df3 = igniteSession.sql(
  "SELECT * FROM `testCache[java.lang.Integer,java.lang.String]` WHERE key % 2 = 0")

df3.printSchema()

df3.show()
```
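For what it's worth, the proposed table names look straightforward to recognize inside the catalog. A minimal sketch (the regex, object, and method names are my own assumptions, not part of the proposal):

```scala
// Hypothetical sketch of parsing the proposed table-name syntax
// `cacheName[keyClass,valueClass]` inside an Ignite-backed Spark catalog.
object TableNameSketch {
  // cache name, then fully qualified key and value class names in brackets.
  private val KeyValueTable = """(\w+)\[([\w.]+),([\w.]+)\]""".r

  /** Returns (cacheName, keyClass, valueClass) if the name matches. */
  def parse(tableName: String): Option[(String, String, String)] =
    tableName match {
      case KeyValueTable(cache, key, value) => Some((cache, key, value))
      case _                                => None
    }
}
```

A plain table name without brackets would simply not match, so regular SQL tables would be unaffected.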

Thoughts?

[1] https://issues.apache.org/jira/browse/IGNITE-3084

--
Nikolay Izhikov
NIzhikov.dev@gmail.com
