ignite-user mailing list archives

From raksja <shanmugkr...@gmail.com>
Subject Ignite behaving strange with Spark SharedRDD in AWS EMR Yarn Client Mode
Date Fri, 04 Aug 2017 22:53:00 GMT
Hi,

We are evaluating Ignite, and our ultimate goal is to share an RDD between
multiple Spark jobs, so that one job can cache its computation in Ignite and
another can use it for its own computation.

We set up Ignite servers on the master node and the 6 worker nodes of an AWS
EMR cluster and ran spark-submit with the provided ScalarSharedRDDExample in
yarn client mode.

Multicast is not enabled in AWS, so we created a custom config file for
IgniteContext creation that uses S3 discovery, and we are able to see
servers=7 (6 workers + 1 master).
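
For readers unfamiliar with S3-based discovery, here is a minimal sketch of
roughly what such a configuration might look like when built in code rather
than in a Spring XML file. It is not the poster's actual config: the bucket
name and the use of environment variables for credentials are assumptions
made for illustration, and the ignite-aws module is assumed to be on the
classpath.

```scala
import com.amazonaws.auth.BasicAWSCredentials
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi
import org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder

object S3DiscoveryConfig {
  // Hypothetical bucket name; nodes register their addresses in this bucket.
  val BucketName = "my-ignite-discovery-bucket"

  def igniteConfig(): IgniteConfiguration = {
    // Assumption: credentials are read from environment variables.
    val credentials = new BasicAWSCredentials(
      sys.env("AWS_ACCESS_KEY_ID"),
      sys.env("AWS_SECRET_ACCESS_KEY"))

    // IP finder that stores and looks up node addresses in S3,
    // replacing the multicast finder that does not work in AWS.
    val ipFinder = new TcpDiscoveryS3IpFinder()
    ipFinder.setBucketName(BucketName)
    ipFinder.setAwsCredentials(credentials)

    val discoverySpi = new TcpDiscoverySpi()
    discoverySpi.setIpFinder(ipFinder)

    val cfg = new IgniteConfiguration()
    cfg.setDiscoverySpi(discoverySpi)
    cfg
  }
}
```

Every node that should join the same cluster, including any node started
from Spark, would need to point at the same bucket.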

https://github.com/apache/ignite/blob/master/examples/src/main/scala/org/apache/ignite/scalar/examples/spark/ScalarSharedRDDExample.scala

Then we ran spark-submit on the above example in yarn client mode, with 1
executor and 1 core, and with one slight modification:
new IgniteContext(sparkContext, CONFIG, true)  ->  true instead of false
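
To make the modification concrete, the sketch below shows the call in
isolation. It assumes the Ignite 2.x ignite-spark module, where the third
constructor argument is the "standalone" flag: as documented, true means the
Ignite nodes are expected to be started separately from Spark, while false
(embedded mode) starts Ignite nodes inside the Spark executors. The config
path is a hypothetical placeholder, not the path used in the original run.

```scala
import org.apache.ignite.spark.IgniteContext
import org.apache.spark.SparkContext

object SharedRddContext {
  // Hypothetical path to the custom Spring XML config with S3 discovery.
  val ConfigPath = "config/ignite-s3-discovery.xml"

  // standalone = true: rely on Ignite server nodes that are already running
  // on the cluster instead of embedding them in the executors.
  def create(sparkContext: SparkContext): IgniteContext =
    new IgniteContext(sparkContext, ConfigPath, true)
}
```

Whether the nodes started from the executors join as clients or as servers
also depends on the client mode setting in the configuration itself, so that
setting is worth checking alongside this flag.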

It printed all the logs for saving the values, and then the strange thing
started to happen.

For each "take" task, it started spinning up multiple executors, and each
executor of course started another IgniteContext. So the total number of
Ignite servers grew to about 25. It then started printing the take values,
and afterwards tore down all those Ignite instances as well as the Spark
executors.

But overall it took a lot of time. We even tried configuring a
RendezvousAffinityFunction with only 1 partition, but there was still no
change. Most of the documentation covers these topics separately (for
example, how to set up Ignite in AWS, or how to use the shared RDD), not in
combination.
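
As an illustration of the single-partition attempt, the following sketch
shows one way a cache could be configured with a RendezvousAffinityFunction
that has exactly 1 partition. The cache name "sharedRDD" and the
Integer key and value types are assumed from the referenced example; the
original configuration was not posted.

```scala
import org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction
import org.apache.ignite.configuration.CacheConfiguration

object SinglePartitionCache {
  def cacheConfig(): CacheConfiguration[Integer, Integer] = {
    // Assumed cache name, taken from the shared RDD example.
    val ccfg = new CacheConfiguration[Integer, Integer]("sharedRDD")

    // First argument: do not exclude neighbors when assigning backups.
    // Second argument: total number of partitions, here just 1.
    ccfg.setAffinity(new RendezvousAffinityFunction(false, 1))
    ccfg
  }
}
```

The resulting configuration would then be registered on the
IgniteConfiguration (for example via setCacheConfiguration) before the
nodes start.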



We just wanted to ask this forum whether we are doing something wrong here.
Has anyone tested it in yarn client mode? Is there any setup documentation
for running it in this mode? Also, why is it spinning up 38 different
executors by itself, even though we specify only 1 to Spark?

Any help would be much appreciated.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Ignite-behaving-strange-with-Spark-SharedRDD-in-AWS-EMR-Yarn-Client-Mode-tp16011.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
