kylin-user mailing list archives

From Li Yang <liy...@apache.org>
Subject Re: Kylin + SparkSQL integration
Date Fri, 24 Mar 2017 23:31:24 GMT
> taking advantage of underlying datasource capabilities (predicate
> pushdown, projection etc.) is important to improve query performance.

That is very true. There was a previous discussion about replacing HBase
with Cassandra
<http://apache-kylin.74782.x6.nabble.com/Cassandra-instead-of-HBase-in-Kylin-td2688.html>.
The worry is that the lack of coprocessors will prevent predicate and
aggregation pushdown. A similar concern exists for Kudu.
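
To make the concern concrete, here is a minimal sketch of what predicate and aggregation pushdown buys. It is a toy in-memory model, not Kylin, HBase, or Cassandra code; the row layout and every function name below are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical rows held by a "storage node": (region, year, sales).
ROWS = [
    ("east", 2016, 10),
    ("east", 2017, 20),
    ("west", 2016, 5),
    ("west", 2017, 7),
    ("east", 2017, 3),
]


def scan_all():
    """No pushdown: storage ships every row to the query engine."""
    return list(ROWS)


def scan_with_pushdown(predicate, group_key, measure):
    """Pushdown: storage filters and pre-aggregates before shipping.

    This stands in for what a coprocessor-like facility would do on
    the storage side (an assumption made for this sketch).
    """
    totals = defaultdict(int)
    for row in ROWS:
        if predicate(row):
            totals[group_key(row)] += measure(row)
    return sorted(totals.items())


def query_without_pushdown():
    """Sum 2017 sales per region; filtering and grouping happen in the engine."""
    shipped = scan_all()
    totals = defaultdict(int)
    for region, year, sales in shipped:
        if year == 2017:
            totals[region] += sales
    return sorted(totals.items()), len(shipped)


def query_with_pushdown():
    """Same query, but the filter and the aggregation run inside storage."""
    shipped = scan_with_pushdown(
        predicate=lambda r: r[1] == 2017,
        group_key=lambda r: r[0],
        measure=lambda r: r[2],
    )
    return shipped, len(shipped)


if __name__ == "__main__":
    print(query_without_pushdown())
    print(query_with_pushdown())
```

Both queries return the same totals, but in this toy data the pushdown version ships one row per group instead of every stored row. A storage layer with no such facility forces the first pattern.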

Cheers
Yang

On Fri, Mar 24, 2017 at 12:50 AM, Nirav Patel <npatel@xactlycorp.com> wrote:

> Thanks for logging those improvements. I think the decision about replacing
> HBase or using any other NoSQL datastore for storing cubes would be based
> on many factors, but one important factor I can think of is the query
> engine/optimizer of each of those datasources. I think taking advantage of
> underlying datasource capabilities (predicate pushdown, projection etc.) is
> important to improve query performance.
>
> Cheers,
> Nirav
>
> On Mon, Mar 20, 2017 at 12:23 PM, Li Yang <liyang@apache.org> wrote:
>
>> Hi Nirav,
>>
>> Glad to see you on the mailing list!!
>>
>> Yes, this is a great idea and it is on the roadmap. (This reminds me, I
>> should update the roadmap on the Kylin website soon.)
>>
>> However there are many moving parts that affect how we approach it. E.g.
>>
>> - If coprocessor is retired, do we still need HBase?
>> - If HBase is retired, what is the alternative storage? How about
>> metadata?
>> - There are other ways to integrate SparkSQL (KYLIN-2515), how do they
>> fit in...
>>
>> There is a lot of work to do in this direction, I would say.
>>
>> Cheers
>> Yang
>>
>> On Tue, Mar 21, 2017 at 2:05 AM, Nirav Patel <npatel@xactlycorp.com>
>> wrote:
>>
>>> Hi,
>>>
>>> At the recent Strata conference I asked whether Kylin can support
>>> SparkSQL as a query engine, or have a Kylin query result set converted
>>> into a Spark Dataset (DataFrame) on which the user can perform further
>>> distributed computation.
>>> The reasons are:
>>> 1) Some flavors of HBase don't support coprocessors
>>> 2) SparkSQL UDFs are much easier to develop than HBase coprocessors
>>> 3) Users can write their own Spark UDFs and run any custom aggregation
>>>
>>> Is this on the roadmap?
>>>
>>> Thanks,
>>> Nirav
