kylin-user mailing list archives

From Jon Shoberg <jon.shob...@gmail.com>
Subject Re: Spark tuning within Kylin? Article? Resource?
Date Tue, 18 Dec 2018 03:16:29 GMT
Greatly appreciate the response.

I started there, but after hitting OOM errors I began adjusting the settings for
my test lab. After minimal success, I thought I'd ask whether there is something
more in-depth on tuning that other Kylin users have found successful.

Right now I've gone back to a very basic configuration with dynamic allocation
to see if I can avoid the late-stage OOM errors.
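
As a rough sketch, a basic dynamic-allocation setup in kylin.properties looks
something like the following. The values are illustrative placeholders, not my
actual lab settings, and dynamic allocation on YARN assumes the external
shuffle service is running on the node managers.

```properties
# Illustrative values only; size these to the cluster.
kylin.engine.spark-conf.spark.master=yarn
kylin.engine.spark-conf.spark.submit.deployMode=cluster

# Let Spark grow and shrink the executor pool instead of fixing a count.
kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true
kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=1
kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=20
# Needed so shuffle files survive when idle executors are released.
kylin.engine.spark-conf.spark.shuffle.service.enabled=true

# Per-executor heap plus off-heap overhead (MB).
kylin.engine.spark-conf.spark.executor.memory=4G
kylin.engine.spark-conf.spark.yarn.executor.memoryOverhead=1024
kylin.engine.spark-conf.spark.executor.cores=1
kylin.engine.spark-conf.spark.driver.memory=2G
```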

J

On Mon, Dec 17, 2018 at 7:44 PM JiaTao Tao <taojiatao@gmail.com> wrote:

> Hope this may help: http://kylin.apache.org/docs/tutorial/cube_spark.html
>
> On Tue, Dec 18, 2018 at 2:34 AM Jon Shoberg <jon.shoberg@gmail.com> wrote:
>
>> Is there a good/favorite article for tuning Spark settings within Kylin?
>>
>> I finally have Spark (2.1.3 as distributed with Kylin 2.5.2) running on
>> my systems.
>>
>> My small data set (35M records) runs well with the default settings.
>>
>> My medium data set (4B records, 40GB compressed source file, 5 measures,
>> 6 dimensions with low cardinality) often dies at Step 3 (Extract Fact Table
>> Distinct Columns) with out-of-memory errors.
>>
>> After using exceptionally large memory settings the job completed, but I'm
>> trying to see whether any optimization is possible.
>>
>> Any suggestions or ideas? I've searched and read about Spark tuning in
>> general, but I don't feel I'm making much progress on optimizing with
>> the settings I've tried.
>>
>> Thanks! J
>>
>
>
> --
>
>
> Regards!
>
> Aron Tao
>
