incubator-hama-dev mailing list archives

From "Edward J. Yoon" <edwardy...@apache.org>
Subject Re: Hang problem
Date Fri, 23 Sep 2011 15:46:55 GMT
In other words, every task should keep entering the next step until the
whole job has completed successfully.
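
To make that concrete, here is a minimal sketch of the idea. It is not
Hama's sync code: it uses java.util.concurrent.CyclicBarrier as a stand-in
for the znode counting, and the names BarrierSketch, run and workSteps are
made up for illustration. It also assumes the total number of supersteps is
known up front, which a real job would have to agree on at runtime. Tasks
that run out of work still enter every barrier, so the expected party count
never has to change.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

public class BarrierSketch {
    /**
     * Starts one thread per task. Task i has real work for workSteps[i]
     * supersteps but keeps entering the barrier until the job ends.
     * Returns the number of tasks that ran to completion.
     */
    public static int run(int[] workSteps) throws InterruptedException {
        if (workSteps.length == 0) {
            return 0; // CyclicBarrier rejects a party count of zero
        }
        // Assumption: the job lasts as long as its longest-running task.
        int longest = 0;
        for (int w : workSteps) {
            longest = Math.max(longest, w);
        }
        final int jobSteps = longest;
        // Fixed party count, like comparing against the initial task size.
        final CyclicBarrier barrier = new CyclicBarrier(workSteps.length);
        final AtomicInteger finished = new AtomicInteger();
        Thread[] tasks = new Thread[workSteps.length];
        for (int i = 0; i < tasks.length; i++) {
            final int myWork = workSteps[i];
            tasks[i] = new Thread(() -> {
                try {
                    for (int step = 0; step < jobSteps; step++) {
                        if (step < myWork) {
                            // computation and messaging would go here
                        }
                        // Idle tasks still sync, so nobody waits forever.
                        barrier.await();
                    }
                    finished.incrementAndGet();
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            tasks[i].start();
        }
        for (Thread t : tasks) {
            t.join();
        }
        return finished.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(new int[] {3, 5, 5, 4}));
    }
}
```

If a task instead left the loop after its own last work step, the remaining
tasks would block in await() forever, which looks like the hang we see. The
other way out would be to shrink the expected count when a task leaves
(java.util.concurrent.Phaser does this with arriveAndDeregister()).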

On Sat, Sep 24, 2011 at 12:37 AM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> According to the BSPMaster log messages, a few of the tasks finished
> with SUCCEEDED status during the iterations. If I remember correctly,
> the child processes call bspPeer.close() at the end.
>
> Then yes, the others will hang at the step that compares the size of
> the znode with the initial task size.
>
> I wonder what happens if some tasks no longer need to communicate with others?
>
> On Fri, Sep 23, 2011 at 11:59 PM, Thomas Jungblut
> <thomas.jungblut@googlemail.com> wrote:
>> Well, for SSSP example it might be correct.
>> But you faced the hanging problems in randbench, too.
>>
>>> Moreover, we have to implement our own mechanisms for high availability if
>>> we have our own sync master server.
>>>
>>
>> +1
>>
>> 2011/9/23 Edward J. Yoon <edwardyoon@apache.org>
>>
>>> As I mentioned before, it's not a ZK problem.
>>>
>>> Moreover, we have to implement our own mechanisms for high availability if
>>> we have our own sync master server.
>>>
>>> Sent from my iPad
>>>
>>> On Sep 23, 2011, at 11:01 PM, Thomas Jungblut <
>>> thomas.jungblut@googlemail.com> wrote:
>>>
>>> > I have made a github for that:
>>> > https://github.com/thomasjungblut/barriersync
>>> >
>>> > Check it out into your eclipse (the root directory failed for whatever
>>> > reason).
>>> > Start the server and then the clientemulator.
>>> > Works like a real charm.
>>> >
>>> > Please consider this as an alternative. We should not roll out a 4.0
>>> > release with a non-working barrier sync.
>>> >
>>> > 2011/9/23 Thomas Jungblut <thomas.jungblut@googlemail.com>
>>> >
>>> >>> Won't be much different.
>>> >>>
>>> >>
>>> >> Let's see.
>>> >>
>>> >> 2011/9/23 Edward J. Yoon <edwardyoon@apache.org>
>>> >>
>>> >>> What happens if some tasks no longer need to communicate with others?
>>> >>>
>>> >>> I didn't look at the code recently but I guess that the problem is
>>> >>> related to the comparison of znode size and task size.
>>> >>>
>>> >>>> I am going to write an RPC barrier sync. Zookeeper sucks in this case.
>>> >>>
>>> >>> Won't be much different. Let's focus on NG integration and the In/Output
>>> >>> system.
>>> >>>
>>> >>> On Fri, Sep 23, 2011 at 8:21 PM, Thomas Jungblut
>>> >>> <thomas.jungblut@googlemail.com> wrote:
>>> >>>> I am going to write an RPC barrier sync. Zookeeper sucks in this case.
>>> >>>>
>>> >>>> 2011/9/23 Edward J. Yoon <edwardyoon@apache.org>
>>> >>>>
>>> >>>>> P.S., Tested on 16 nodes using 10 tasks per node.
>>> >>>>>
>>> >>>>> On Fri, Sep 23, 2011 at 7:19 PM, Edward J. Yoon <edwardyoon@apache.org>
>>> >>>>> wrote:
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> Today I ran the sssp example with a 4GB sample file.
>>> >>>>>>
>>> >>>>>> At the 32nd step, some tasks finish and the others hang forever.
>>> >>>>>>
>>> >>>>>> Could anyone figure out this problem?
>>> >>>>>>
>>> >>>>>> Plus, there're too many INFO-level logs. Let's reduce them.
>>> >>>>>>
>>> >>>>>> Thanks.
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> Best Regards, Edward J. Yoon
>>> >>>>>> @eddieyoon
>>> >>>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>>
>>> >>>>> --
>>> >>>>> Best Regards, Edward J. Yoon
>>> >>>>> @eddieyoon
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>> --
>>> >>>> Thomas Jungblut
>>> >>>> Berlin
>>> >>>>
>>> >>>> mobile: 0170-3081070
>>> >>>>
>>> >>>> business: thomas.jungblut@testberichte.de
>>> >>>> private: thomas.jungblut@gmail.com
>>> >>>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>> Best Regards, Edward J. Yoon
>>> >>> @eddieyoon
>>> >>>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> Thomas Jungblut
>>> >> Berlin
>>> >>
>>> >> mobile: 0170-3081070
>>> >>
>>> >> business: thomas.jungblut@testberichte.de
>>> >> private: thomas.jungblut@gmail.com
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Thomas Jungblut
>>> > Berlin
>>> >
>>> > mobile: 0170-3081070
>>> >
>>> > business: thomas.jungblut@testberichte.de
>>> > private: thomas.jungblut@gmail.com
>>>
>>
>>
>>
>> --
>> Thomas Jungblut
>> Berlin
>>
>> mobile: 0170-3081070
>>
>> business: thomas.jungblut@testberichte.de
>> private: thomas.jungblut@gmail.com
>>
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon
>



-- 
Best Regards, Edward J. Yoon
@eddieyoon
