giraph-dev mailing list archives

From "Hassan Eslami (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-1066) Functional adaptive out-of-core mechanism
Date Thu, 19 May 2016 02:53:12 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-1066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15290341#comment-15290341 ]

Hassan Eslami commented on GIRAPH-1066:
---------------------------------------

https://reviews.facebook.net/D55479

> Functional adaptive out-of-core mechanism
> -----------------------------------------
>
>                 Key: GIRAPH-1066
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1066
>             Project: Giraph
>          Issue Type: New Feature
>          Components: bsp, graph
>            Reporter: Hassan Eslami
>            Assignee: Hassan Eslami
>
> In this JIRA we propose the following contributions to the out-of-core mechanism:
> • A simpler API is provided for trying out various out-of-core policies on top of the
> basic infrastructure proposed in GIRAPH-1048. The new API lets developers of out-of-core
> policies focus on the out-of-core logic rather than on multi-threading, disk interactions,
> and similar complications. The policy logic is abstracted out as far as possible, so that
> new out-of-core policies are simple to develop and try.
> • Two adaptive out-of-core policies are implemented using the proposed API. One is
> based on a few recent GC behaviors, and the other is based on user-defined thresholds
> that control the memory pressure. With the adaptive out-of-core policies, the job
> automatically uses secondary storage devices when the data cannot fit into memory. If the
> memory pressure later goes down during the computation, the data spilled to secondary
> storage is automatically loaded back into memory.
> • The out-of-core infrastructure is integrated with the message flow control proposed in
> GIRAPH-1027. Using credit-based flow control, an out-of-core policy can predict how much
> memory messages will use in the near future, and so can exert fine-grained control over
> messages and their memory footprint.
> • A new feature, called data generation tethering, is also added. This feature lets
> the out-of-core policy decide how many threads (input/compute) should be active at any
> moment, which indirectly controls the rate of data generation and, in turn, the memory
> footprint of graph data.
> With this JIRA landed, we will have a fully functional out-of-core infrastructure that
> prevents any reasonable job from failing due to OOM.
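
For illustration only, the threshold-based policy and the thread-count control described in the quoted text could be sketched roughly as below. This is not code from the patch or from Giraph: the class ThresholdOocPolicy, the Action enum, the two watermarks, and the linear scaling of the thread count are all assumptions made for the example. Only the Runtime memory calls are standard Java.

```java
/**
 * Hypothetical sketch of a threshold-based out-of-core policy.
 * None of these names come from Giraph; they are assumptions for illustration.
 */
final class ThresholdOocPolicy {
  /** What the policy asks the out-of-core engine to do next. */
  enum Action { SPILL_TO_DISK, LOAD_FROM_DISK, NONE }

  private final double lowWatermark;   // below this heap fraction, load data back
  private final double highWatermark;  // above this heap fraction, spill data
  private final int maxThreads;        // upper bound on input/compute threads

  ThresholdOocPolicy(double lowWatermark, double highWatermark, int maxThreads) {
    if (!(0.0 <= lowWatermark && lowWatermark < highWatermark
        && highWatermark <= 1.0)) {
      throw new IllegalArgumentException("need 0 <= low < high <= 1");
    }
    if (maxThreads < 1) {
      throw new IllegalArgumentException("need at least one thread");
    }
    this.lowWatermark = lowWatermark;
    this.highWatermark = highWatermark;
    this.maxThreads = maxThreads;
  }

  /** Decide whether to spill, load back, or do nothing for a given heap usage. */
  Action nextAction(double usedFraction) {
    if (usedFraction > highWatermark) {
      return Action.SPILL_TO_DISK;
    }
    if (usedFraction < lowWatermark) {
      return Action.LOAD_FROM_DISK;
    }
    return Action.NONE;
  }

  /**
   * Number of threads allowed to stay active. Scales linearly from
   * maxThreads (at or below the low watermark) down to 1 (at or above the
   * high watermark), so that data generation slows as memory fills up.
   */
  int activeThreads(double usedFraction) {
    if (usedFraction <= lowWatermark) {
      return maxThreads;
    }
    if (usedFraction >= highWatermark) {
      return 1; // keep one thread running so the job still makes progress
    }
    double headroom =
        (highWatermark - usedFraction) / (highWatermark - lowWatermark);
    return Math.max(1, (int) Math.round(headroom * maxThreads));
  }

  /** Fraction of the maximum heap currently in use, from standard JVM calls. */
  static double heapUsedFraction() {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();
    return (double) used / rt.maxMemory();
  }
}
```

A caller would poll ThresholdOocPolicy.heapUsedFraction() periodically and pass the result to nextAction and activeThreads. Using two watermarks instead of one leaves a band in which the policy does nothing, which avoids alternating between spilling and loading when usage hovers near a single threshold.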



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
