hama-dev mailing list archives

From "ChiaHung Lin" <chl...@nuk.edu.tw>
Subject Re: Lock and Barrier Synchronization
Date Wed, 22 Jun 2011 09:12:08 GMT
From the code I looked at, it seems that the znodes created consist of peer names only (in
the form of `host:port'). Therefore, processes at different supersteps share one flat namespace.
While iterating over supersteps, a process in a newer superstep cannot be distinguished
from one in an older superstep, which results in the process hanging. Adding the superstep value
to the created znode and filtering out znodes of the next superstep might solve the problem.

But I haven't tested the code, so I may be wrong if I have misunderstood it.
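
The naming idea could be sketched roughly as follows. This is only an untested illustration, not Hama's actual code: the class SuperstepZnodes, the `_` separator, and the helper names are all hypothetical, and the list of children stands in for whatever zk.getChildren() returns.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: znode names carry the superstep as a prefix,
// e.g. "4_host:port" instead of just "host:port".
final class SuperstepZnodes {
    // Assumed separator; the superstep prefix is numeric, so the first
    // occurrence of the separator always ends the prefix.
    private static final String SEP = "_";

    // Build the znode name for a peer at a given superstep.
    static String znodeName(long superstep, String peerName) {
        return superstep + SEP + peerName;
    }

    // Extract the superstep from a znode name built by znodeName().
    static long superstepOf(String znodeName) {
        int i = znodeName.indexOf(SEP);
        if (i < 0) {
            throw new IllegalArgumentException("no superstep prefix: " + znodeName);
        }
        return Long.parseLong(znodeName.substring(0, i));
    }

    // Keep only the children that belong to the given superstep,
    // filtering out znodes already created for a later superstep.
    static List<String> ofSuperstep(List<String> children, long superstep) {
        List<String> result = new ArrayList<String>();
        for (String child : children) {
            if (superstepOf(child) == superstep) {
                result.add(child);
            }
        }
        return result;
    }

    // A peer may leave the barrier of a superstep once no znode of that
    // same superstep remains, regardless of znodes from other supersteps.
    static boolean canLeaveBarrier(List<String> children, long superstep) {
        return ofSuperstep(children, superstep).isEmpty();
    }
}
```

With names like these, a peer leaving the barrier of superstep 4 would ignore a child such as `5_hostB:61000`, created by a faster peer that has already entered superstep 5, instead of waiting on it.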

-----Original message-----
From: Edward J. Yoon <edwardyoon@apache.org>
Date: Tue, 21 Jun 2011 17:20:21 +0900
Subject: Re: Lock and Barrier Synchronization

This can be especially problematic when locking a large number of BSPPeers.

On Tue, Jun 21, 2011 at 5:13 PM, Edward J. Yoon <edwardyoon@apache.org> wrote:
> Hi all,
> Recently I've been looking at HAMA-387.
> There's a problem related to lock and barrier synchronization.
> The problem is that as soon as the last of the lock files is deleted
> (before the while loop in the leaveBarrier method has been completely
> exited), others begin to create their lock files. So it sometimes
> causes a hang.
> My temporary solution is 'Thread.sleep(200);'. Good but not perfect.
> If the zk.getChildren() response takes longer than 200 milliseconds,
> the process will hang.
> Are there any other ideas?
> Thanks.
> --
> Best Regards, Edward J. Yoon
> @eddieyoon

Best Regards, Edward J. Yoon

ChiaHung Lin
Department of Information Management
National University of Kaohsiung
