giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: Giraph offloadPartition fails creation directory
Date Fri, 13 Sep 2013 09:27:55 GMT
I have no idea without the logs, especially when it happens rarely.
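
One thing that may be worth ruling out, purely as an assumption since nothing
in this thread confirms it: java.io.File.mkdirs() returns false not only when
creation fails but also when the directory already exists. Code that logs an
error on every false return would report a "failed to create directory" even
though files can be written there afterwards. The sketch below is not taken
from the Giraph source; the class and method names are hypothetical and only
illustrate the difference between the two checks.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MkdirsCheck {
    /** Naive check: treats every false return from mkdirs() as a failure. */
    public static boolean naiveCreate(File dir) {
        return dir.mkdirs();
    }

    /** Tolerant check: succeeds if the directory exists after the call. */
    public static boolean ensureDirectory(File dir) {
        // mkdirs() is false when the directory was already there (for example,
        // created earlier or by another thread), so also accept that case.
        return dir.mkdirs() || dir.isDirectory();
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical scratch location, not a real Giraph path.
        File base = Files.createTempDirectory("partitions-demo").toFile();
        File dir = new File(base, "job-demo/part-vertices");

        System.out.println("first naive call:  " + naiveCreate(dir));
        System.out.println("second naive call: " + naiveCreate(dir));
        System.out.println("tolerant call:     " + ensureDirectory(dir));
    }
}
```

If the errors only show up for directories that already exist on disk, this
would be a likely explanation; the logs would still be needed to confirm it.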


On Fri, Sep 13, 2013 at 12:33 AM, Alexander Asplund
<alexasplund@gmail.com> wrote:

> Actually, why is it saying it fails to create directory in the first
> place, when it is trying to write files?
>  On Sep 12, 2013 3:04 PM, "Alexander Asplund" <alexasplund@gmail.com>
> wrote:
>
>> I can also add that there is no such issue with DiskBackedMessageStore.
>> It successfully creates a large number of store files, and never starts
>> failing.
>> On Sep 12, 2013 2:11 PM, "Alexander Asplund" <alexasplund@gmail.com>
>> wrote:
>>
>>> It's very strange. It is definitely failing on some partitions.
>>> Currently the size on disk of an offloading worker corresponds roughly to
>>> the size of its part of the graph, but the worker attempts to create
>>> additional partitions, and this fails.
>>> On Sep 12, 2013 2:07 PM, "Alexander Asplund" <alexasplund@gmail.com>
>>> wrote:
>>>
>>>> Actually, I take that back. It seems it does succeed in creating
>>>> partitions; it just struggles with it sometimes. Should I be worried about
>>>> these errors if partition directories seem to be filling up?
>>>> On Sep 11, 2013 6:38 PM, "Claudio Martella" <claudio.martella@gmail.com>
>>>> wrote:
>>>>
>>>>> Giraph does not offload partitions or messages to HDFS in the
>>>>> out-of-core module. It uses local disk on the computing nodes. By default,
>>>>> it uses the tasktracker local directory where, for example, the distributed
>>>>> cache is stored.
>>>>>
>>>>> Could you provide the stack trace Giraph prints when it fails?
>>>>>
>>>>>
>>>>> On Thu, Sep 12, 2013 at 12:54 AM, Alexander Asplund <
>>>>> alexasplund@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm still trying to get Giraph to work on a graph that requires more
>>>>>> memory than is available. The problem is that when the Workers try to
>>>>>> offload partitions, the offloading fails. The DiskBackedPartitionStore
>>>>>> fails to create the directory
>>>>>> _bsp/_partitions/job-xxxx/part-vertices-xxx (roughly, from memory).
>>>>>>
>>>>>> The input or computation will then continue for a while, which I
>>>>>> believe is because it is still managing to hold everything in memory,
>>>>>> but at some point it reaches the limit where there simply is no more
>>>>>> heap space, and it crashes with OOM.
>>>>>>
>>>>>> Has anybody had this problem with Giraph failing to make HDFS
>>>>>> directories?
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>    Claudio Martella
>>>>>    claudio.martella@gmail.com
>>>>>
>>>>
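
Since the out-of-core partitions go to local disk rather than HDFS, it can help
to point them at a local directory you know is writable and has enough space.
A minimal sketch follows, assuming the option names giraph.useOutOfCoreGraph,
giraph.maxPartitionsInMemory and giraph.partitionsDirectory as I recall them
from GiraphConstants (please check them against your Giraph version). The
helper class, the partition count and the directory path are hypothetical. It
only builds the "-ca key=value" custom arguments accepted by GiraphRunner.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class OutOfCoreOptions {
    /** Collects the out-of-core graph options (names assumed, see note above). */
    public static Map<String, String> outOfCoreGraph(int maxPartitionsInMemory,
                                                     String localDirs) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("giraph.useOutOfCoreGraph", "true");
        options.put("giraph.maxPartitionsInMemory",
                Integer.toString(maxPartitionsInMemory));
        // Local file system path on the worker nodes, not an HDFS path.
        options.put("giraph.partitionsDirectory", localDirs);
        return options;
    }

    /** Renders the options as repeated "-ca key=value" arguments. */
    public static String toCustomArgs(Map<String, String> options) {
        StringJoiner joiner = new StringJoiner(" ");
        for (Map.Entry<String, String> e : options.entrySet()) {
            joiner.add("-ca").add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values: 10 partitions in memory, a local scratch disk.
        System.out.println(
                toCustomArgs(outOfCoreGraph(10, "/data/1/giraph/partitions")));
    }
}
```

If the partitions directory is left at its relative default, it would be
resolved against the task's local working directory, which matches the
_bsp/_partitions/... path in the original report.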


-- 
   Claudio Martella
   claudio.martella@gmail.com
