ignite-user mailing list archives

From Paolo Di Tommaso <paolo.ditomm...@gmail.com>
Subject Re: [Spark-Ignite] How to run exactly one Ignite worker in each Spark cluster node.
Date Thu, 07 Jul 2016 15:21:37 GMT
Hi Luis,

Thanks a lot for the link. It doesn't solve the problem, which seems to be
related to how Spark manages task execution, but it helps a lot to understand
how the Ignite integration works behind the scenes.


Cheers,
Paolo


On Wed, Jul 6, 2016 at 3:29 PM, Luis Mateos <luismattor@gmail.com> wrote:

> Hi Paolo,
>
> You might want to check the Ignite Spark module code:
>
>
> https://github.com/apache/ignite/blob/1.6.0/modules/spark/src/main/scala/org/apache/ignite/spark/IgniteContext.scala
>
> Basically, it uses sc.parallelize to execute the function that starts the
> Ignite nodes.
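>
> For reference, a minimal sketch of that pattern (illustrative only, not
> the actual IgniteContext code; the method name and the empty configuration
> are assumptions):
>
>     import org.apache.ignite.Ignition
>     import org.apache.ignite.configuration.IgniteConfiguration
>     import org.apache.spark.SparkContext
>
>     // Schedule one task per desired worker; each task starts an embedded
>     // Ignite server node inside the executor JVM that runs it.
>     def startIgniteNodes(sc: SparkContext): Unit = {
>       val workers = sc.getConf.getInt("spark.executor.instances", 1)
>       sc.parallelize(1 to workers, workers).foreachPartition { _ =>
>         // Runs on an executor. Note that nothing here forces Spark to
>         // place these tasks on distinct nodes, which is exactly the
>         // issue discussed in this thread.
>         Ignition.start(new IgniteConfiguration())
>       }
>     }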
>
> Regards,
> Luis
>
>
> On 6 July 2016 at 14:20, Paolo Di Tommaso <paolo.ditommaso@gmail.com>
> wrote:
>
>> Hi,
>>
>> I'm using the Ignite embedded deployment to run an Ignite workload in a
>> Spark cluster.
>>
>> In my use case I need to deploy exactly one Ignite worker on each
>> node in the Spark cluster. However, I haven't found a way to do that.
>>
>>
>> Consider this scenario: I'm running a 3-node Spark cluster
>> on AWS (1 driver, 2 workers, each node with 3 cores). I would like to run
>> 2 Ignite workers, one on each Spark worker.
>>
>> I'm using the following script:
>>
>> https://gist.github.com/pditommaso/660cbee09755b2b880099ab3bf2c609a
>>
>>
>> I've set `spark.executor.instances = 2` in order to deploy two Ignite
>> workers; indeed, in the main log I can read the following:
>>
>>
>> 16/07/06 18:58:02 INFO spark.IgniteContext: Will start Ignite nodes on 2
>> workers
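>>
>> For reference, the property is set in the driver along these lines (a
>> sketch only; the actual code is in the gist linked above, and the app
>> name is hypothetical):
>>
>>     import org.apache.spark.{SparkConf, SparkContext}
>>
>>     // Request two executors so IgniteContext starts two Ignite workers.
>>     val conf = new SparkConf()
>>       .setAppName("ignite-embedded")        // hypothetical app name
>>       .set("spark.executor.instances", "2")
>>     val sc = new SparkContext(conf)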
>>
>>
>>
>> However, what happens is that Ignite is launched on only one Spark node.
>>
>>
>> Looking at the same log, the following lines seem to suggest the reason:
>>
>> 16/07/06 18:58:05 INFO scheduler.TaskSetManager: Starting task 0.0 in
>> stage 0.0 (TID 0, *ip-10-37-175-68*.eu-west-1.compute.internal,
>> partition 0,PROCESS_LOCAL, 2137 bytes)
>>
>> 16/07/06 18:58:05 INFO scheduler.TaskSetManager: Starting task 1.0 in
>> stage 0.0 (TID 1, *ip-10-37-175-68*.eu-west-1.compute.internal,
>> partition 1,PROCESS_LOCAL, 2194 bytes)
>>
>>
>> Spark is running two tasks to deploy the Ignite workers, but both of them
>> run on the same node (*ip-10-37-175-68*).
>>
>> Is there any workaround to avoid this? Or, more generally, is it possible
>> to deploy exactly one Ignite worker on each node in the Spark cluster?
>>
>>
>> Thanks a lot.
>>
>> Cheers,
>> Paolo
>>
>>
>
>
