ignite-user mailing list archives

From Andrey Gura <ag...@gridgain.com>
Subject Re: Sharing Spark RDDs with Ignite
Date Fri, 12 Feb 2016 15:31:17 GMT
Dmitry,

I repeated your test. On my laptop it took about 2300 ms.

Keeping in mind that RDDs are lazy by nature, I assumed that DataFrame is
lazy too. So I added a df.rdd().count() call before the RDD caching in order
to measure its execution time separately, and got about 670 ms.
After that, the igniteRDD.saveValues(df.rdd()) call takes about 1500 ms.
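
To illustrate why a lazy pipeline skews a single timing, here is a minimal
sketch in plain Java. It is only an analogy and is not Spark or Ignite code:
a java.util.stream pipeline stands in for the lazy RDD/DataFrame, and the
class name LazyPipelineSketch, the helper lazyRows() and the ROWS_LOADED
counter are hypothetical names introduced for this example. Building the
pipeline does no work; the first terminal operation pays for all of it.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.IntStream;
import java.util.stream.Stream;

public class LazyPipelineSketch {
    // Counts how many rows the (hypothetical) load step has actually produced.
    static final AtomicInteger ROWS_LOADED = new AtomicInteger();

    // Builds a lazy pipeline: no row is produced until a terminal operation runs.
    static Stream<String> lazyRows(int n) {
        return IntStream.range(0, n).mapToObj(i -> {
            ROWS_LOADED.incrementAndGet();
            return "row-" + i;
        });
    }

    public static void main(String[] args) {
        long t0 = System.nanoTime();
        Stream<String> rows = lazyRows(75_000);      // only describes the work
        long t1 = System.nanoTime();
        int loadedAfterBuild = ROWS_LOADED.get();    // still zero at this point

        // The terminal operation is where the upstream work is actually done.
        List<String> materialized = rows.collect(Collectors.toList());
        long t2 = System.nanoTime();

        System.out.printf("build: %d ms, rows loaded so far: %d%n",
                (t1 - t0) / 1_000_000, loadedAfterBuild);
        System.out.printf("materialize: %d ms, rows: %d%n",
                (t2 - t1) / 1_000_000, materialized.size());
    }
}
```

In the same way, timing only the save step of a lazy dataset charges it with
all of the upstream work as well, which is why forcing evaluation first and
timing it on its own gives a clearer breakdown.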

For more accurate results I measured these operations in a loop and got
about 700 ms for RDD caching on a warmed-up JVM.
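
A loop measurement of this kind can be sketched roughly as follows. This is
a generic plain-Java harness, not the code from the pull request: the class
name WarmupTimingSketch, the measure() helper and the placeholder workload
are assumptions made for illustration. In a real test the Runnable would
wrap the operation being timed, for example the saveValues() call.

```java
import java.util.ArrayList;
import java.util.List;

public class WarmupTimingSketch {
    // Runs the action warmupRuns times without recording anything, then
    // measuredRuns more times, returning each measured duration in ms.
    static List<Long> measure(Runnable action, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            action.run(); // results discarded: lets the JIT compile hot paths
        }
        List<Long> timesMs = new ArrayList<>();
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            action.run();
            timesMs.add((System.nanoTime() - start) / 1_000_000);
        }
        return timesMs;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for the operation under test.
        Runnable workload = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 200_000; i++) {
                sb.append(i % 10);
            }
            if (sb.length() != 200_000) {
                throw new IllegalStateException("unexpected length");
            }
        };
        List<Long> times = measure(workload, 5, 10);
        System.out.println("measured runs (ms): " + times);
    }
}
```

Discarding the first few iterations keeps one-time costs such as class
loading and JIT compilation out of the reported numbers. For anything more
rigorous than a quick check, a dedicated benchmarking harness is usually a
better choice than a hand-written loop.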

I created a pull request for clarity:
https://github.com/erasmas/ignite-playground/pull/1

On Thu, Feb 11, 2016 at 3:20 PM, Dmitriy Morozov <int.256h@gmail.com> wrote:

> Hi Valentin,
>
> Sorry, I realize I didn't get it right. I'm using IgniteRDD to save RDD
> values now and IgniteCache to cache StructType.
> I'm using a ~1 MB Parquet file for testing, which has ~75K rows. I noticed
> that saving IgniteRDD is expensive: it takes about 4 seconds on my laptop.
> I tried both client and server mode for IgniteContext but still couldn't
> make it faster.
>
> Here's the code
> <https://github.com/erasmas/ignite-playground/blob/master/src/main/java/ignite/CachedRddExample.java>
> that I tried. I'd appreciate it if somebody could give a hint on how to
> make it faster.
>
> Thanks!
>
> On 10 February 2016 at 21:55, vkulichenko <valentin.kulichenko@gmail.com>
> wrote:
>
>> Hi Dmitry,
>>
>> What are you trying to achieve by putting the RDD into the cache as a
>> single entry? If you want to save RDD data into the Ignite cache, it's
>> better to create IgniteRDD and use its savePairs() or saveValues()
>> methods. See [1] for details.
>>
>> [1] https://apacheignite-fs.readme.io/docs/ignitecontext-igniterdd#section-saving-values-to-ignite
>>
>> -Val
>>
>>
>>
>>
>
>
>
> --
> Kind regards,
> Dima
>



-- 
Andrey Gura
GridGain Systems, Inc.
www.gridgain.com
