ignite-user mailing list archives

From Dmitriy Morozov <int.2...@gmail.com>
Subject Re: Sharing Spark RDDs with Ignite
Date Fri, 12 Feb 2016 20:29:49 GMT
Thanks, Andrey!

It totally makes sense. I should have done a more accurate test. Appreciate
your help!
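
For anyone finding this thread later, here is a rough sketch of the measuring
pattern described in the quoted message: force the lazy load on its own first,
then time the save in a loop so that JVM warm-up is visible. It is plain Java
with no Spark or Ignite dependency. The Lazy class, loadRows() and save() are
hypothetical stand-ins (for a lazily evaluated dataset, reading the file, and
saving values to a cache); they are assumptions for illustration, not the code
from the repository or the pull request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class LazyTimingSketch {

    /** Memoizing supplier: a hypothetical stand-in for a lazy, cached dataset. */
    static final class Lazy<T> implements Supplier<T> {
        private final Supplier<T> loader;
        private T value;
        private boolean loaded;
        int loads; // how many times the loader actually ran

        Lazy(Supplier<T> loader) {
            this.loader = loader;
        }

        @Override
        public T get() {
            if (!loaded) {
                value = loader.get();
                loaded = true;
                loads++;
            }
            return value;
        }
    }

    /** Runs the action and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable action) {
        long start = System.nanoTime();
        action.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    /** Hypothetical stand-in for reading ~75K rows from a file. */
    static List<Integer> loadRows() {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 75_000; i++) {
            rows.add(i);
        }
        return rows;
    }

    /** Hypothetical stand-in for saving values to a cache; returns the number saved. */
    static int save(List<Integer> rows, List<Integer> sink) {
        sink.clear();
        sink.addAll(rows);
        return sink.size();
    }

    public static void main(String[] args) {
        Lazy<List<Integer>> rows = new Lazy<>(LazyTimingSketch::loadRows);
        List<Integer> sink = new ArrayList<>();

        // Step 1: force the lazy load by itself so its cost is not billed to the save.
        long loadMs = timeMillis(() -> rows.get().size());

        // Step 2: time the save repeatedly; the first iterations include JIT warm-up.
        for (int i = 0; i < 10; i++) {
            long saveMs = timeMillis(() -> save(rows.get(), sink));
            System.out.println("iteration " + i + ": save took " + saveMs + " ms");
        }
        System.out.println("load took " + loadMs + " ms, loader ran " + rows.loads + " time(s)");
    }
}
```

The actual numbers will differ from machine to machine; the point is only that
the load and the save are timed separately, and that later loop iterations are
a fairer measure than the first one.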

On 12 February 2016 at 17:31, Andrey Gura <agura@gridgain.com> wrote:

> Dmitry,
>
> I repeated your test. On my laptop it took about 2300 ms.
>
> Keeping in mind that an RDD is lazy by nature, I assumed that the DataFrame
> is lazy too. So I added a df.rdd().count() call before the RDD caching in
> order to measure its execution time, and got about 670 ms.
> After that, the igniteRDD.saveValues(df.rdd()) call takes about 1500 ms.
>
> For more accurate results, I measured these operations in a loop and got
> about 700 ms for RDD caching on a warmed-up JVM.
>
> I created a pull request for clarity:
> https://github.com/erasmas/ignite-playground/pull/1
>
> On Thu, Feb 11, 2016 at 3:20 PM, Dmitriy Morozov <int.256h@gmail.com>
> wrote:
>
>> Hi Valentin,
>>
>> Sorry, I realize I didn't get it right. I'm using IgniteRDD to save RDD
>> values now and IgniteCache to cache StructType.
>> I'm using a ~1 MB Parquet file for testing, which has ~75K rows. I noticed
>> that saving to IgniteRDD is expensive: it takes about 4 seconds on my laptop.
>> I tried both client and server mode for IgniteContext but still couldn't
>> make it faster.
>>
>> Here's the code
>> <https://github.com/erasmas/ignite-playground/blob/master/src/main/java/ignite/CachedRddExample.java>
>> that I tried. I'd appreciate it if somebody could give a hint on how to
>> make it faster.
>>
>> Thanks!
>>
>> On 10 February 2016 at 21:55, vkulichenko <valentin.kulichenko@gmail.com>
>> wrote:
>>
>>> Hi Dmitry,
>>>
>>> What are you trying to achieve by putting the RDD into the cache as a
>>> single entry? If you want to save RDD data into the Ignite cache, it's
>>> better to create an IgniteRDD and use its savePairs() or saveValues()
>>> methods. See [1] for details.
>>>
>>> [1]
>>>
>>> https://apacheignite-fs.readme.io/docs/ignitecontext-igniterdd#section-saving-values-to-ignite
>>>
>>> -Val
>>>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://apache-ignite-users.70518.x6.nabble.com/Sharing-Spark-RDDs-with-Ignite-tp2805p2941.html
>>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>>
>>
>>
>>
>> --
>> Kind regards,
>> Dima
>>
>
>
>
> --
> Andrey Gura
> GridGain Systems, Inc.
> www.gridgain.com
>



-- 
Kind regards,
Dima
