giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-747) BspServiceMaster finishes ZooKeeper cleanup without waiting for all workers to complete
Date Thu, 30 Jan 2014 19:52:10 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13886989#comment-13886989 ]

Eli Reisman commented on GIRAPH-747:
------------------------------------

Hey, reviewing this. I recall this issue; I thought I was shimming this number somewhere else?
The reason is that BspServiceMaster is also used by non-YARN code paths, and I didn't want to
break or alter the shared code.

Could another non-YARN Giraph committer take a look and see if this change is safe? If so,
we should definitely commit this. If not, maybe another (ugh) munge flag here will suffice?


> BspServiceMaster finishes ZooKeeper cleanup without waiting for all workers to complete
> ---------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-747
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-747
>             Project: Giraph
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Chuan Lei
>            Assignee: Chuan Lei
>             Fix For: 1.0.0
>
>         Attachments: GIRAPH-747.v1.patch
>
>
> In BspServiceMaster, the function cleanUpZooKeeper should wait for all workers and
> masters to complete. However, it appears that maxTasks only takes workers into
> consideration. Consequently, a straggling worker may fail to report to ZooKeeper because
> the path gets removed too early. This causes a "No lease on path File does not exist"
> exception at runtime.
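
The counting problem described above can be illustrated with a minimal, self-contained sketch. This is not the Giraph code or the attached patch: the class name CleanupBarrierSketch, the methods expectedFinished and safeToCleanUp, and the marker strings are all hypothetical, and a plain list stands in for the "finished" markers that tasks would report through ZooKeeper.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Giraph.
public class CleanupBarrierSketch {

    /** Number of "finished" markers to wait for before removing the shared path. */
    public static int expectedFinished(int workers, int masters) {
        // Assumption: every worker and every master reports exactly once.
        return workers + masters;
    }

    /** True only when every expected task has reported, so cleanup is safe. */
    public static boolean safeToCleanUp(List<String> finishedMarkers,
                                        int workers, int masters) {
        return finishedMarkers.size() >= expectedFinished(workers, masters);
    }

    public static void main(String[] args) {
        int workers = 3;
        int masters = 1;

        // Two workers and the master have reported; one worker is still running.
        List<String> reported = Arrays.asList("worker-0", "worker-1", "master-0");

        // Worker-only threshold: 3 markers >= 3 workers, so cleanup would start
        // while the straggling worker still needs the path.
        boolean workerOnlyCheck = reported.size() >= workers;

        // Worker-plus-master threshold: 3 markers < 4 tasks, so cleanup waits.
        boolean fullCheck = safeToCleanUp(reported, workers, masters);

        System.out.println("worker-only check allows cleanup: " + workerOnlyCheck);
        System.out.println("worker+master check allows cleanup: " + fullCheck);
    }
}
```

With three workers and one master, the worker-only threshold is already met once any three tasks have reported, which is the early-removal case the report describes. Whether the same threshold is safe for non-YARN deployments is the open question raised in the comment above.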



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
