giraph-dev mailing list archives

From "Roman Shaposhnik (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-747) BspServiceMaster finishes ZooKeeper cleanup without waiting for all workers to complete
Date Tue, 11 Feb 2014 23:04:20 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13898460#comment-13898460 ]

Roman Shaposhnik commented on GIRAPH-747:
-----------------------------------------

[~initialcontext] any chance we can fix this for 1.1.0? I guess you're the resident Giraph-on-YARN
expert ;-)

> BspServiceMaster finishes ZooKeeper cleanup without waiting for all workers to complete
> ---------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-747
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-747
>             Project: Giraph
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Chuan Lei
>            Assignee: Chuan Lei
>             Fix For: 1.1.0
>
>         Attachments: GIRAPH-747.v1.patch
>
>
> In BspServiceMaster, the function cleanUpZooKeeper should wait for all workers and
> masters to complete. However, it appears that maxTasks only takes workers into
> consideration. Consequently, a straggling worker may fail to report to ZooKeeper because
> the path gets removed too early. This causes a "No lease on path File does not exist"
> exception at runtime.
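
To make the reported counting problem concrete, here is a minimal, self-contained
sketch. It is not Giraph's code: the class name, method names, and task counts
below are hypothetical, and it assumes cleanup proceeds once the number of
"finished" markers reaches an expected task count (the role the report attributes
to maxTasks).

```java
/** Hypothetical illustration of the counting issue; not BspServiceMaster itself. */
final class CleanupWaitSketch {

    /** Expected count when only workers are considered (the behavior the report describes). */
    public static int workersOnly(int workers) {
        return workers;
    }

    /** Expected count when workers and masters are both considered. */
    public static int workersAndMasters(int workers, int masters) {
        return workers + masters;
    }

    /** Cleanup may remove the ZooKeeper paths only once every expected task has reported. */
    public static boolean safeToCleanUp(int finishedMarkers, int expectedTasks) {
        return finishedMarkers >= expectedTasks;
    }

    public static void main(String[] args) {
        int workers = 3;
        int masters = 1;
        // Assumed scenario: the master and two workers have reported; one worker straggles.
        int finishedMarkers = 3;

        // With workers only, the threshold is already met, so cleanup would start
        // while the straggler still needs its path.
        System.out.println("workers only: "
            + safeToCleanUp(finishedMarkers, workersOnly(workers)));

        // With workers and masters, the threshold is not met, so cleanup keeps waiting.
        System.out.println("workers + masters: "
            + safeToCleanUp(finishedMarkers, workersAndMasters(workers, masters)));
    }
}
```

In this assumed scenario the master's own marker fills the slot that the
workers-only count reserves for the last worker, which is one way the early
removal described above could arise.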



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
