incubator-hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: setNumBspTask
Date Mon, 18 Apr 2011 10:02:27 GMT
Let's assume that there are three slaves A, B, and C.

A and B are 100% done, but they must not start the next superstep before
receiving messages from C.

P.S. Of course, if C dies, the master will reassign C's task to A or B.
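
To make the A/B/C case concrete, here is a minimal sketch. It uses java.util.concurrent.CyclicBarrier as a stand-in and is not Hama's actual sync code; the class name BarrierSketch, the run helper, and the 50 ms delay for C are made up for illustration. It only shows that A and B cannot enter the next superstep until C has arrived at the barrier.

```java
import java.util.List;
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CyclicBarrier;

// Hypothetical illustration only: one thread per slave, one barrier per superstep.
final class BarrierSketch {

    // Runs the given number of supersteps and returns the order of events.
    static List<String> run(String[] peers, int supersteps) {
        List<String> log = new CopyOnWriteArrayList<>();
        CyclicBarrier barrier = new CyclicBarrier(peers.length);
        Thread[] threads = new Thread[peers.length];

        for (int i = 0; i < peers.length; i++) {
            final String name = peers[i];
            // Assumption for the demo: C is the slow slave.
            final long delayMillis = name.equals("C") ? 50L : 0L;
            threads[i] = new Thread(() -> {
                try {
                    for (int step = 0; step < supersteps; step++) {
                        Thread.sleep(delayMillis);              // local computation
                        log.add(name + " finished step " + step);
                        barrier.await();                        // wait for all peers
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }

        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        run(new String[] {"A", "B", "C"}, 2).forEach(System.out::println);
    }
}
```

In the printed log, every "finished step 0" line appears before any "finished step 1" line, even though A and B are done long before C. The order of A, B, and C within one step is not fixed.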

2011/4/18 Thomas Jungblut <thomas.jungblut@googlemail.com>:
> This is a bit off-topic here, but I think this is possible. We can maintain a
> list of actively working grooms per job and just sync them. If a groom wants
> to leave, it will be removed from the list.
>
> We can discuss this later in the fault tolerance issue.
>
> 2011/4/18 Edward J. Yoon <edward@udanax.org>
>
>> > groom can leave the sync process and the other grooms will advance
>> > without waiting for the finished groom.
>>
>>
>> It's impossible.
>>
>> Sent from my iPhone
>>
>> On 2011. 4. 18., at 6:08 PM, Thomas Jungblut <
>> thomas.jungblut@googlemail.com> wrote:
>>
>> > Should I open a new issue for it?
>> >
>> > Agree with the sync barrier.
>> > In addition to that, we should add an equivalent of voteForHalt() where a
>> > groom can leave the sync process and the other grooms will advance without
>> > waiting for the finished groom.
>> >
>> > 2011/4/18 Edward J. Yoon <edwardyoon@apache.org>
>> >
>> >>> I think we should document that this is a mandatory call to a job OR set
>> >>> this in submitJobInternal of the BSPJobClient to the number of active
>> >>> grooms.
>> >>> What's your opinion on that?
>> >>
>> >> +1
>> >>
>> >> Also, this is closely related to HAMA-199. Our 'sync barrier' is very naive.
>> >> We should fix this problem as soon as possible.
>> >>
>> >> On Mon, Apr 18, 2011 at 3:20 AM, Thomas Jungblut
>> >> <thomas.jungblut@googlemail.com> wrote:
>> >>> Hi all,
>> >>>
>> >>> I played a bit with submitting BSPs from other applications and saw that
>> >>> setting the number of BSP tasks is a mandatory action.
>> >>> Otherwise the job will hang forever with this output:
>> >>>
>> >>> 11/04/17 20:07:50 INFO bsp.BSPJobClient: Running job: job_201104172007_0001
>> >>> 11/04/17 20:07:53 INFO bsp.BSPJobClient: Current supersteps number: 0
>> >>>
>> >>> I think we should document that this is a mandatory call to a job OR set
>> >>> this in submitJobInternal of the BSPJobClient to the number of active
>> >>> grooms.
>> >>> What's your opinion on that?
>> >>>
>> >>> Regards
>> >>> Thomas
>> >>>
>> >>> --
>> >>> Thomas Jungblut
>> >>> Berlin
>> >>>
>> >>> mobile: 0170-3081070
>> >>>
>> >>> business: thomas.jungblut@testberichte.de
>> >>> private: thomas.jungblut@gmail.com
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> Best Regards, Edward J. Yoon
>> >> http://blog.udanax.org
>> >> http://twitter.com/eddieyoon
>> >>
>>



-- 
Best Regards, Edward J. Yoon
http://blog.udanax.org
http://twitter.com/eddieyoon
