giraph-dev mailing list archives

From "Hassan Eslami (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-1048) Redesign of out-of-core mechanism (first patch -- out-of-core mechanism keeping fixed number of partitions in memory)
Date Tue, 15 Mar 2016 02:30:33 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15194615#comment-15194615 ]

Hassan Eslami commented on GIRAPH-1048:
---------------------------------------

https://reviews.facebook.net/D54549

> Redesign of out-of-core mechanism (first patch -- out-of-core mechanism keeping fixed number of partitions in memory)
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: GIRAPH-1048
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1048
>             Project: Giraph
>          Issue Type: New Feature
>            Reporter: Hassan Eslami
>            Assignee: Hassan Eslami
>              Labels: out-of-memory
>
> The current out-of-core mechanism implemented in Giraph suffers from a few issues:
> - It does not integrate well with a flow-control mechanism in which the rate of incoming/outgoing messages is controlled according to available memory,
> - It does not control the data generation/processing rate of compute/input threads, which is crucial in the input superstep, and also in compute supersteps in some applications,
> - It does not utilize the disk bandwidth properly due to concurrent disk accesses (IO interference),
> - It suffers from high overhead due to successive manual GC calls, even when the high memory pressure cannot be addressed by offloading data to disk,
> - It has a complicated design, making it difficult to debug and improve upon,
> - It is very difficult to try different out-of-core policies, making it impossible to tune the mechanism.
> A simple-to-tune/program, flexible, and yet efficient out-of-core infrastructure is needed in Giraph. In this JIRA we propose a redesign of the out-of-core mechanism, in which a) the logic of IO operations, b) the logic of out-of-core decisions, c) the data structures supporting out-of-core operations, and d) the actual logic for the computation are four different decoupled entities. Some IOCommands and an IOScheduler address the logic behind IO operations, an OutOfCoreEngine and a MetaPartitionManager address the logic for out-of-core decisions, several disk-backed data structures are responsible for keeping the necessary data, and finally, the old in-memory computation mechanism interacts with the out-of-core infrastructure seamlessly.
> This JIRA is created to lay the groundwork for the out-of-core infrastructure, and as an initial proof of concept, a simple out-of-core policy using the mentioned infrastructure is implemented. The out-of-core policy in this JIRA, also called the fixed out-of-core policy, tries to keep a certain (user-defined) number of partitions in memory.
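
For readers unfamiliar with the idea, the fixed out-of-core policy described above can be illustrated with a small, self-contained Java sketch. It is not code from the patch: the class FixedPartitionPolicySketch, its request() method, and the LOAD_PARTITION/STORE_PARTITION command names are hypothetical, and evicting the least recently used partition is an assumption made only for illustration (the description does not say which partition is offloaded). The sketch only shows how a decision component might emit IO commands so that at most a user-defined number of partitions stays in memory.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;

/** Hypothetical illustration only; these are not Giraph's actual classes. */
class FixedPartitionPolicySketch {
  /** Hypothetical command types, loosely inspired by the "IOCommands" idea. */
  enum CommandType { LOAD_PARTITION, STORE_PARTITION }

  /** A single IO action on one partition, to be executed by some scheduler. */
  static final class IOCommand {
    final CommandType type;
    final int partitionId;

    IOCommand(CommandType type, int partitionId) {
      this.type = type;
      this.partitionId = partitionId;
    }

    @Override
    public String toString() {
      return type + "(" + partitionId + ")";
    }
  }

  /** User-defined upper bound on partitions kept in memory. */
  private final int maxPartitionsInMemory;

  /** Access-ordered map: iteration starts at the least recently used entry. */
  private final LinkedHashMap<Integer, Boolean> inMemory =
      new LinkedHashMap<>(16, 0.75f, true);

  FixedPartitionPolicySketch(int maxPartitionsInMemory) {
    if (maxPartitionsInMemory < 1) {
      throw new IllegalArgumentException("need at least one in-memory partition");
    }
    this.maxPartitionsInMemory = maxPartitionsInMemory;
  }

  /** Returns the IO commands needed before the given partition can be processed. */
  List<IOCommand> request(int partitionId) {
    List<IOCommand> commands = new ArrayList<>();
    // get() on an access-ordered map also marks the entry as most recently used.
    if (inMemory.get(partitionId) != null) {
      return commands; // already in memory, no IO needed
    }
    if (inMemory.size() >= maxPartitionsInMemory) {
      // Assumption: offload the least recently used partition to make room.
      Integer victim = inMemory.keySet().iterator().next();
      inMemory.remove(victim);
      commands.add(new IOCommand(CommandType.STORE_PARTITION, victim));
    }
    commands.add(new IOCommand(CommandType.LOAD_PARTITION, partitionId));
    inMemory.put(partitionId, Boolean.TRUE);
    return commands;
  }

  boolean isInMemory(int partitionId) {
    return inMemory.containsKey(partitionId);
  }

  int inMemoryCount() {
    return inMemory.size();
  }
}
```

With a limit of two partitions, requesting partitions 0, 1, 0, and then 2 in this sketch offloads partition 1 before loading partition 2. Keeping the decision logic separate from the code that actually performs the disk IO mirrors the decoupling proposed in the description, though the real patch may structure this quite differently.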



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
