giraph-dev mailing list archives

From "Eli Reisman (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-249) Move part of the graph out-of-core when memory is low
Date Tue, 17 Jul 2012 01:27:34 GMT


Eli Reisman commented on GIRAPH-249:

I did test against trunk. What I meant above is that I first tried a small sample of data,
compared to what I was running successfully on the other patches, and it still died. I wish
I could send you the metrics. I can't, but you could install GIRAPH-232 and set up a Graphite
server on your local machine to view them in your browser; it's dramatic what a great view
under the hood you get. But yes, please take my word: it's all about the input superstep right
now. If we get through that better, the thing will scale a long way before you have to worry
about fixing the computation steps.

I agree about trying a lower memory ratio in the config. I would say you might even be much
safer at ".15". I see failures at 92% to 94% memory usage in my metrics, no matter which
patches are involved. If it gets above that, there is no chance for that worker to survive
the input step. 85% usage seems to be totally safe.
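
For anyone who wants a rough local check of those numbers without a Graphite setup, here is a
minimal sketch. It is not part of the patch or of Giraph: the class name, method names, and
cutoffs are hypothetical, and the cutoffs simply mirror the 85% and 92% figures above. It only
uses the standard java.lang.Runtime heap counters.

```java
public class HeapRatioCheck {
    // Hypothetical cutoffs mirroring the observations above; not Giraph settings.
    static final double SAFE_RATIO = 0.85;
    static final double DANGER_RATIO = 0.92;

    /** Fraction of the maximum heap that is in use, given raw byte counts. */
    static double usedRatio(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            throw new IllegalArgumentException("maxBytes must be positive");
        }
        return (double) usedBytes / maxBytes;
    }

    /** Used-heap fraction for the current JVM, from the standard Runtime counters. */
    static double currentUsedRatio() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return usedRatio(used, rt.maxMemory());
    }

    /** Buckets a ratio: OK up to 85%, WARN above that, DANGER from 92% upward. */
    static String classify(double ratio) {
        if (ratio >= DANGER_RATIO) {
            return "DANGER";
        }
        if (ratio > SAFE_RATIO) {
            return "WARN";
        }
        return "OK";
    }

    public static void main(String[] args) {
        double ratio = currentUsedRatio();
        System.out.printf("heap used: %.1f%% -> %s%n", ratio * 100, classify(ratio));
    }
}
```

Calling classify(currentUsedRatio()) periodically while input splits are being loaded would
give a crude signal of how close a worker is to the range where failures were seen. The numbers
come from one set of cluster runs, so treat them as a starting point rather than fixed limits.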

I might be working on another project this week, but I will try to keep an eye on this and
help test more if you like. It will be almost impossible to know whether this patch is doing
what it should if you are not testing on a cluster. It's really great code, nice work either way.

> Move part of the graph out-of-core when memory is low
> -----------------------------------------------------
>                 Key: GIRAPH-249
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Alessandro Presta
>            Assignee: Alessandro Presta
>         Attachments: GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch, GIRAPH-249.patch,
> There has been some talk about Giraph's scaling limitations due to keeping the whole
> graph and messages in RAM.
> We need to investigate methods to fall back to disk when running out of memory, while
> gracefully degrading performance.
> This issue is for graph storage. Messages should probably be a separate issue, although
> the interplay between the two is crucial.
> We should also discuss what are our primary goals here: completing a job (albeit slowly)
> instead of failing when the graph is too big, while still encouraging memory optimizations
> and high-memory clusters; or restructuring Giraph to be as efficient as possible in disk mode,
> making it almost a standard way of operating.


