ignite-user mailing list archives

From Pascoe Scholle <pascoescholletr...@gmail.com>
Subject Re: Job Stealing node not stealing jobs
Date Tue, 10 Sep 2019 09:19:39 GMT
Hello,

Is there any update on this?

We have not been able to resolve this issue.

Kind regards,


On Wed, 04 Sep 2019 at 07:44, Pascoe Scholle <pascoescholletrash@gmail.com>
wrote:

> Hi,
>
> I have attached a small Scala project. Just set the build path to src after
> building and compiling with sbt.
>
> We want to execute processes that happen outside the JVM. These processes
> can be extremely memory-intensive, which is why I am limiting the
> number of parallel jobs that can be executed on a machine.
>
> I have one desktop that has a lot more memory available and can thus
> execute more jobs in parallel. As all jobs take roughly the same amount of
> time, this machine will have completed its jobs much faster. I want it to
> then take jobs from the nodes started on weaker machines once it has
> completed all its tasks.
>
> Does that make sense?
>
> Hope this helps.
>
> BR,
> Pascoe
>
> On Tue, 3 Sep 2019 at 17:29, Andrei Aleksandrov <aealexsandrov@gmail.com>
> wrote:
>
>> Hi,
>>
>> Some remarks about job stealing SPI:
>>
>> 1) You have some nodes that can process the tasks of a compute job.
>> 2) Tasks are executed in the public thread pool by default:
>> https://apacheignite.readme.io/docs/thread-pools#section-public-pool
>> 3) If a node's thread pool is busy, some tasks of the compute job can be
>> executed on another node.
>>
>> It will not work in the following cases:
>>
>> 1) If you choose a specific node for your compute task
>> 2) If you do an affinity call (the same as above, but the node is
>> chosen by affinity mapping)
>>
>> Regarding your case:
>>
>> It's not clear to me what exactly you are trying to do. Possibly job
>> stealing didn't work because your weak node began executing some tasks in
>> the public pool and simply takes longer to finish them than the faster one.
>>
>> Could you please share your full reproducer for investigation?
>>
>> BR,
>> Andrei
>>
>> On 9/3/2019 1:43 PM, Pascoe Scholle wrote:
>> > Hi there,
>> >
>> > I have asked this question before, but I asked it under a different and
>> > already resolved topic, so I am posting the question again under a more
>> > suitable title. I hope that's OK.
>> >
>> > We have tried to configure two compute server nodes, one of which is
>> > running on a weaker machine. The node running on the more powerful
>> > machine always finishes its tasks long before the weaker node and
>> > then sits idle.
>> >
>> > The node is not even sending a steal request, so I must have
>> > configured something wrong.
>> >
>> > I have attached the code for both nodes. If you could kindly point out
>> > what I am missing, I would really appreciate it!
>> >
>> >
>>
>
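
For reference, the kind of setup discussed in this thread can be sketched
roughly as below. This is only a minimal sketch and not the configuration
from the attached project, which is not shown in the thread. The object name
StealingNode, the parameter maxParallelJobs and all threshold values are
assumptions made for illustration. It assumes that each node caps its
parallel jobs through the collision SPI's active-jobs threshold and that the
job stealing failover SPI is set alongside it on every node.

```scala
import org.apache.ignite.Ignition
import org.apache.ignite.configuration.IgniteConfiguration
import org.apache.ignite.spi.collision.jobstealing.JobStealingCollisionSpi
import org.apache.ignite.spi.failover.jobstealing.JobStealingFailoverSpi

// Hypothetical node launcher; names and values are illustrative only.
object StealingNode {

  // maxParallelJobs: assumed per-machine limit (higher on the strong desktop).
  def config(maxParallelJobs: Int): IgniteConfiguration = {
    val collisionSpi = new JobStealingCollisionSpi()

    // Cap on jobs running at the same time on this node.
    collisionSpi.setActiveJobsThreshold(maxParallelJobs)
    // Jobs above the active cap wait in the queue. As far as I understand
    // the SPI, waiting jobs beyond this threshold are the ones another
    // node may steal.
    collisionSpi.setWaitJobsThreshold(0)
    // Allow this node to take jobs from other nodes.
    collisionSpi.setStealingEnabled(true)
    // Illustrative values, not tuned recommendations.
    collisionSpi.setMaximumStealingAttempts(5)
    collisionSpi.setMessageExpireTime(1000L)

    val cfg = new IgniteConfiguration()
    cfg.setCollisionSpi(collisionSpi)
    // Stolen jobs are re-routed through the matching failover SPI.
    cfg.setFailoverSpi(new JobStealingFailoverSpi())
    cfg
  }

  def main(args: Array[String]): Unit = {
    // First argument: parallel job limit for this machine (default 2).
    val maxParallelJobs = args.headOption.map(_.toInt).getOrElse(2)
    Ignition.start(config(maxParallelJobs))
  }
}
```

With a setup along these lines, the weaker node would be started with a
small limit and the desktop with a larger one. Whether any jobs actually end
up waiting on the weaker node, and are therefore available to be stolen,
depends on how the task maps its jobs, as noted in the remarks above.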
