giraph-dev mailing list archives

From "Maja Kabiljo (JIRA)" <j...@apache.org>
Subject [jira] [Created] (GIRAPH-508) Increase the limit on the number of partitions
Date Thu, 07 Feb 2013 00:12:12 GMT
Maja Kabiljo created GIRAPH-508:
-----------------------------------

             Summary: Increase the limit on the number of partitions
                 Key: GIRAPH-508
                 URL: https://issues.apache.org/jira/browse/GIRAPH-508
             Project: Giraph
          Issue Type: Improvement
            Reporter: Maja Kabiljo
            Assignee: Maja Kabiljo


The total number of partitions is currently limited to 2995. This limit comes from the
Zookeeper znode size limit of 1MB and the assumption that a partition owner description
can take 300 bytes.
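
The arithmetic behind such a limit can be sketched as follows. This is only an illustration, not the Giraph code: the class and method names are hypothetical, and the znode limit is assumed to be exactly 1024 * 1024 bytes. Note that a 300-byte budget yields 3495 partitions under these assumptions, while 2995 is what integer division gives for a 350-byte budget.

```java
// Hypothetical helper, not part of Giraph: how many fixed-size partition
// owner descriptions fit into a single znode.
final class ZnodePartitionLimit {
    // Assumption: the znode size limit is exactly 1 MiB.
    static final int ZNODE_LIMIT_BYTES = 1024 * 1024;

    // Integer division: a partially written description does not count.
    static int maxPartitions(int bytesPerPartition) {
        if (bytesPerPartition <= 0) {
            throw new IllegalArgumentException("bytesPerPartition must be positive");
        }
        return ZNODE_LIMIT_BYTES / bytesPerPartition;
    }

    public static void main(String[] args) {
        System.out.println(maxPartitions(300)); // 3495
        System.out.println(maxPartitions(350)); // 2995
    }
}
```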

In the simplest case, when checkpointing is not used and partitions don't move around, we
write 5 ints and a hostname per partition. If partitions move around, we write one more
hostname and 2 more ints. When checkpointing is used, we also write the path to the
checkpoint file.
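
To get a feel for the sizes involved, the per-partition cost can be estimated roughly as below. The encoding is an assumption (4-byte ints, and strings written as a 2-byte length prefix plus their UTF-8 bytes, similar to DataOutput.writeUTF for ASCII text); the real serialization may differ, and the hostnames are made up.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical size estimate for one partition owner description.
final class PartitionOwnerSize {
    // Assumption: each string costs a 2-byte length prefix plus its UTF-8 bytes.
    static int stringBytes(String s) {
        return 2 + s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Total bytes for the given number of ints plus any strings
    // (hostnames, checkpoint path).
    static int bytes(int intCount, String... strings) {
        int total = intCount * Integer.BYTES;
        for (String s : strings) {
            total += stringBytes(s);
        }
        return total;
    }

    public static void main(String[] args) {
        // Simplest case: 5 ints and one hostname.
        System.out.println(bytes(5, "worker-17.example.com")); // 43
        // Partition moved: one more hostname and 2 more ints.
        System.out.println(
            bytes(7, "worker-17.example.com", "worker-03.example.com")); // 74
    }
}
```

Under these assumptions most of the space goes to the strings, which is why long hostnames and checkpoint paths drive the per-partition budget.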

For now, we can drop the whole WorkerInfo description per partition and use just taskIds,
since all WorkerInfos are written at the beginning. This leaves just 4 ints per partition
when checkpointing is not used, and allows us to have many more partitions.
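
A minimal sketch of the compact form, assuming each partition is described by exactly 4 ints and that workers resolve taskIds against the WorkerInfos written earlier. The class name and layout are hypothetical; the message does not say which 4 ints are kept.

```java
import java.nio.ByteBuffer;

// Hypothetical compact encoding: 4 ints per partition, no hostnames.
final class CompactPartitionOwners {
    static final int INTS_PER_PARTITION = 4;
    static final int BYTES_PER_PARTITION = INTS_PER_PARTITION * Integer.BYTES;

    // Packs one row of 4 ints per partition into a single byte array.
    static byte[] encode(int[][] owners) {
        ByteBuffer buffer = ByteBuffer.allocate(owners.length * BYTES_PER_PARTITION);
        for (int[] owner : owners) {
            if (owner.length != INTS_PER_PARTITION) {
                throw new IllegalArgumentException("expected 4 ints per partition");
            }
            for (int value : owner) {
                buffer.putInt(value);
            }
        }
        return buffer.array();
    }

    // Number of such records that fit under a given znode size limit.
    static int maxPartitions(int znodeLimitBytes) {
        return znodeLimitBytes / BYTES_PER_PARTITION;
    }

    public static void main(String[] args) {
        int[][] owners = {{0, 3, -1, 0}, {1, 5, -1, 0}, {2, 3, -1, 0}};
        System.out.println(encode(owners).length);      // 48
        System.out.println(maxPartitions(1024 * 1024)); // 65536
    }
}
```

At 16 bytes per partition, a 1 MiB znode would hold 65536 such records under these assumptions, compared with 2995 today.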

When checkpointing is used, we can either keep the limit (but still raise it a bit), or
have all workers read partition metadata when restarting from a checkpoint.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
