giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-326) Writing input splits to ZooKeeper in parallel
Date Fri, 14 Sep 2012 18:28:07 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13456009#comment-13456009 ]

Eli Reisman commented on GIRAPH-326:
------------------------------------

This makes a lot of sense. I have seen ZooKeeper really bog down when lots of writes and reads
are all occurring concurrently and the quorum is syncing all the time.

On the other hand, I have run countless jobs with many machines and many input splits, and
have not encountered a big slowdown in this initial write phase. The threads should each be
writing splits to a different znode path the whole time. What was the setup where this problem
was encountered? Lots of machines, or few machines and lots of splits?
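
A minimal sketch of the pattern under discussion, not the code in GIRAPH-326.patch: a fixed
thread pool where each task writes one split to its own path. The class name, the
"/inputSplits" base path, and the in-memory FAKE_ZK map standing in for a ZooKeeper client
are all hypothetical, chosen only so the example runs without a ZooKeeper server.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSplitWriter {
    /** Hypothetical stand-in for a ZooKeeper client: maps znode path to data. */
    public static final Map<String, byte[]> FAKE_ZK = new ConcurrentHashMap<>();

    /** Writes split i to basePath/i using a fixed pool of numThreads workers. */
    public static void writeSplits(List<byte[]> splits, String basePath, int numThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int i = 0; i < splits.size(); i++) {
                final String path = basePath + "/" + i;  // one distinct path per split
                final byte[] data = splits.get(i);
                tasks.add(() -> {
                    // A real client would issue a znode create here instead.
                    FAKE_ZK.put(path, data);
                    return null;
                });
            }
            // invokeAll blocks until every task finishes; get() rethrows any failure.
            for (Future<Void> f : pool.invokeAll(tasks)) {
                f.get();
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<byte[]> splits = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            splits.add(("split-" + i).getBytes(StandardCharsets.UTF_8));
        }
        writeSplits(splits, "/inputSplits", 16);
        System.out.println(FAKE_ZK.size() + " splits written");
    }
}
```

Because every task targets a different path, the writers never contend on the same node; any
slowdown against a real quorum would come from the server side rather than from the clients
blocking each other.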

Interesting stuff. I look forward to trying this out and hearing more about this problem.

                
> Writing input splits to ZooKeeper in parallel
> ---------------------------------------------
>
>                 Key: GIRAPH-326
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-326
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>         Attachments: GIRAPH-326.patch
>
>
> (Posting issue and the patch from a colleague)
> Writing input splits to ZooKeeper can take a lot of time. From his experiments: serial
> 2m45s, with 16 cores 15s.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
