giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: Lockup During Edge Saving
Date Mon, 15 Sep 2014 19:21:09 GMT
It looks like you're running out of memory, and the likely culprit is your
output format. Since you're writing CSV, I suspect you're building a single
String line for each vertex before writing it to HDFS. For vertices with
many edges that string can get very large, which could be one of your
problems. Could you test with a standard edge output format first?
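
To illustrate the point about one String per vertex, here is a minimal sketch in plain Java of the alternative: write one short CSV line per edge straight to the writer, so no buffer grows with the vertex's degree. It deliberately avoids the Giraph API. The class name EdgeCsvSketch, the method writeEdgesPerLine, and the long/double id and value types are assumptions made for the example, not taken from the custom output format discussed in this thread.

```java
import java.io.IOException;
import java.io.Writer;

/**
 * Hypothetical sketch, not Giraph API: emits one CSV line per edge
 * instead of accumulating every edge of a vertex into a single String.
 */
public class EdgeCsvSketch {

    /**
     * Writes "source,target,value" followed by a newline for each edge.
     * Only the small per-field strings are allocated, so memory use does
     * not depend on how many edges the vertex has.
     */
    public static void writeEdgesPerLine(long sourceId, long[] targetIds,
                                         double[] values, Writer out)
            throws IOException {
        for (int i = 0; i < targetIds.length; i++) {
            out.write(Long.toString(sourceId));
            out.write(',');
            out.write(Long.toString(targetIds[i]));
            out.write(',');
            out.write(Double.toString(values[i]));
            out.write('\n');
        }
    }
}
```

In a real job the Writer would be whatever record writer the output format hands you. A per-edge text output format that converts each edge to its own line follows the same idea.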

On Fri, Sep 12, 2014 at 12:10 AM, Andrew Munsell <andrew@wizardapps.net>
wrote:

> Now that I have the loading and computation completing successfully, I
> am having issues when saving the edges back to disk. During the saving
> step, the machines get through roughly 1-2 partitions before the cluster
> freezes up entirely (as in, I can't even SSH into the machines or view the
> Hadoop web console).
>
> As in my message before, I have about 1.3 billion edges total (600 million
> undirected, converted using the reverser) and a cluster of 19 machines,
> each with 8 cores and 60 GB of RAM.
>
> I am also using a custom linked-list based OutEdges class because of the
> computation's high number of mutations of edge values (the byte array/big
> data byte array was not efficient for this use case).
>
> The specific computation I am running has three supersteps (0, 1, 2), and
> during supersteps 1 and 2 there is extremely high RAM usage (~97%), but the
> steps do complete. During saving this high RAM usage is maintained and does
> not increase significantly until the cluster freezes up.
>
> When saving the edges (I am using a custom edge output format as well,
> that is basically a CSV), are they flushed to disk immediately/in batches
> or is the entire output file held in memory before being flushed? If the
> latter, this seems like it might cause the same sort of behavior I see.
> Also, if this is the case, is there a way this can be changed?
>
> If this doesn't seem like the issue, does anyone have any ideas what may
> be causing the lockup?
>
> Thanks in advance!
>
> --
> Andrew
>
>
>



-- 
   Claudio Martella
