ignite-user mailing list archives

From Pascoe Scholle <pascoescholletr...@gmail.com>
Subject Re: Job Stealing node not stealing jobs
Date Tue, 10 Sep 2019 10:01:13 GMT
Thanks for the prompt response. I have looked at the
WeightedRandomLoadBalancingSpi. It does not look like one can set the
number of parallel jobs, though, and this is a big requirement. Also, it is
inevitable that some nodes will sit idle, due to the nature of the jobs
that will be deployed on the nodes, and the job stealer just seems
like the perfect solution. Regardless, I have used the code provided for
the job stealing SPI on the docs page and it isn't functioning as intended.
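
For reference, a job stealing setup of the kind discussed in this thread
usually looks roughly like the sketch below. It is not the code from the
attached project. It assumes ignite-core is on the classpath, the class name
JobStealingNode and all threshold values are placeholders, and
setActiveJobsThreshold is used here as the knob for the number of parallel
jobs per node.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.collision.jobstealing.JobStealingCollisionSpi;
import org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi;

// Hypothetical class name; every numeric value below is a placeholder.
public class JobStealingNode {
    public static void main(String[] args) {
        JobStealingCollisionSpi collisionSpi = new JobStealingCollisionSpi();

        // Maximum number of jobs this node runs in parallel
        // (assumed here to be a small number on the weaker machine).
        collisionSpi.setActiveJobsThreshold(4);

        // Waiting-queue size at or below which this node may ask other
        // nodes for jobs; 0 means it only steals once its queue is empty.
        collisionSpi.setWaitJobsThreshold(0);

        // Stealing has to be enabled on the nodes that take part.
        collisionSpi.setStealingEnabled(true);

        // How often a single job may be stolen, and how long a steal
        // request stays valid (milliseconds).
        collisionSpi.setMaximumStealingAttempts(10);
        collisionSpi.setMessageExpireTime(1000);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setCollisionSpi(collisionSpi);

        // The job stealing collision SPI works together with the job
        // stealing failover SPI, so both are configured on the node.
        cfg.setFailoverSpi(new JobStealingFailoverSpi());

        Ignite ignite = Ignition.start(cfg);
        System.out.println("Started node " + ignite.cluster().localNode().id());
    }
}
```

With a setup like this, the stronger machine would get a higher active jobs
threshold than the weaker one, and both nodes would need the collision and
failover SPIs configured.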


On Tue, 10 Sep 2019 at 11:34, Stephen Darlington <
stephen.darlington@gridgain.com> wrote:

> I don’t know the answer to your job stealing question, but I do wonder if
> that’s the right configuration for your requirements. Why not use the
> weighted load balancer (https://apacheignite.readme.io/docs/load-balancing)?
> That’s designed to work in cases where nodes are of differing sizes.
>
> Regards,
> Stephen
>
> On 10 Sep 2019, at 10:19, Pascoe Scholle <pascoescholletrash@gmail.com>
> wrote:
>
> Hello,
>
> Is there any update on this?
>
> We have not been able to resolve this issue.
>
> Kind regards
>
>
> On Wed, 04 Sep 2019 at 07:44, Pascoe Scholle <pascoescholletrash@gmail.com>
> wrote:
>
>> Hi,
>>
>> I have attached a small Scala project. Just set the build path to src
>> after building and compiling with sbt.
>>
>> We want to execute processes that happen outside the JVM. These processes
>> can be extremely memory intensive which is why I am limiting the
>> number of parallel jobs that can be executed on a machine.
>>
>> I have one desktop that has a lot more memory available and can thus
>> execute more jobs in parallel. As all jobs take roughly the same amount of
>> time, this machine will have completed its jobs much faster. I want it to
>> then take jobs from the nodes started on weaker machines once it has
>> completed all its tasks.
>>
>> Does that make sense?
>>
>> Hope this helps.
>>
>> BR,
>> Pascoe
>>
>> On Tue, 3 Sep 2019 at 17:29, Andrei Aleksandrov <aealexsandrov@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> Some remarks about job stealing SPI:
>>>
>>> 1) You have some nodes that can process the tasks of a compute job.
>>> 2) Tasks are executed in the public thread pool by default:
>>> https://apacheignite.readme.io/docs/thread-pools#section-public-pool
>>> 3) If a node's thread pool is busy, some tasks of the compute job can be
>>> executed on another node.
>>>
>>> It will not work in the following cases:
>>>
>>> 1) If you choose a specific node for your compute task.
>>> 2) If you do an affinity call (the same as above, but the node is
>>> chosen by affinity mapping).
>>>
>>> Regarding your case:
>>>
>>> It's not clear to me what exactly you are trying to do. Possibly job
>>> stealing didn't work because your weak node had already begun executing
>>> some tasks in the public pool and simply takes longer than the faster one.
>>>
>>> Could you please share your full reproducer for investigation?
>>>
>>> BR,
>>> Andrei
>>>
>>> On 9/3/2019 1:43 PM, Pascoe Scholle wrote:
>>> > Hi there,
>>> >
>>> > I have asked this question before, but under a different, already
>>> > resolved topic, so I am posting the question again under a more
>>> > suitable title. I hope that's OK.
>>> >
>>> > We have tried to configure two compute server nodes, one of which is
>>> > running on a weaker machine. The node running on the more powerful
>>> > machine always finishes its tasks far earlier than the weaker node
>>> > and then sits idle.
>>> >
>>> > The node is not even sending a steal request, so I must have
>>> > configured something wrong.
>>> >
>>> > I have attached the code for both nodes. If you could kindly point out
>>> > what I am missing, I would really appreciate it!
>>> >
>>> >
>>>
>>
>
>
