giraph-user mailing list archives

From Suijian Zhou <suijian.z...@gmail.com>
Subject Re: Giraph program stucks.
Date Fri, 07 Mar 2014 15:42:24 GMT
The current setting is:
  <name>mapred.child.java.opts</name>
  <value>-Xmx6144m -XX:+UseParallelGC -mx1024m -XX:MaxHeapFreeRatio=10
-XX:MinHeapFreeRatio=10</value>

Is 6144MB enough (for each task tracker)? That is, I have 39 nodes to
process the 8*2GB input files.

  Best Regards,
  Suijian
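
Note on the option string above: it carries two heap flags, -Xmx6144m and -mx1024m. The sketch below assumes that -mx is a legacy spelling of -Xmx and that the last heap flag on the command line takes effect, which is the behaviour usually seen with HotSpot JVMs; neither point is confirmed in this thread. The helper name `effective_max_heap_mb` is hypothetical, and the worker count of 39 assumes one worker per node.

```python
import re

# Assumption: -mx is treated as a legacy spelling of -Xmx, and the last
# heap flag on the command line wins (typical HotSpot behaviour).
HEAP_FLAG = re.compile(r"-(?:Xmx|mx)(\d+)([kKmMgG])")
MB_PER_UNIT = {"k": 1 / 1024, "m": 1, "g": 1024}


def effective_max_heap_mb(java_opts):
    """Return the max heap in MB implied by the last -Xmx/-mx flag, or None."""
    result = None
    for token in java_opts.split():
        match = HEAP_FLAG.fullmatch(token)
        if match:
            result = int(match.group(1)) * MB_PER_UNIT[match.group(2).lower()]
    return result


# The value of mapred.child.java.opts quoted in the message above.
opts = ("-Xmx6144m -XX:+UseParallelGC -mx1024m "
        "-XX:MaxHeapFreeRatio=10 -XX:MinHeapFreeRatio=10")
heap_mb = effective_max_heap_mb(opts)
workers = 39  # assumption: one Giraph worker per node
print(heap_mb, "MB per task,", heap_mb * workers / 1024, "GB across workers")
```

Under those assumptions each task would get 1024 MB rather than 6144 MB, so it may be worth dropping -mx1024m and checking the command line of a running child JVM to see which limit is really in effect.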



2014-03-07 9:21 GMT-06:00 Claudio Martella <claudio.martella@gmail.com>:

> those settings won't be used by Giraph (or by any MapReduce application),
> only by the Hadoop infrastructure itself.
> you should use mapred.child.java.opts instead.
>
>
> On Fri, Mar 7, 2014 at 4:19 PM, Suijian Zhou <suijian.zhou@gmail.com> wrote:
>
>> Hi, Claudio,
>>   I set the following when I ran the program:
>> export HADOOP_DATANODE_OPTS="-Xmx10g"
>> and
>> export HADOOP_HEAPSIZE=30000
>>
>> in hadoop-env.sh and restarted hadoop.
>>
>>   Best Regards,
>>   Suijian
>>
>>
>>
>> 2014-03-06 17:29 GMT-06:00 Claudio Martella <claudio.martella@gmail.com>:
>>
>>> did you actually increase the heap?
>>>
>>>
>>> On Thu, Mar 6, 2014 at 11:43 PM, Suijian Zhou <suijian.zhou@gmail.com> wrote:
>>>
>>>> Hi,
>>>>   I tried to process only 2 of the input files, i.e., 2GB + 2GB input,
>>>> and the program finished successfully in 6 minutes. But since I have 39
>>>> nodes, shouldn't they be enough to load and process the 8*2GB=16GB
>>>> graph? Can somebody give some hints? (Do all the nodes participate in
>>>> loading the graph from HDFS, or does only the master node load it?)
>>>> Thanks!
>>>>
>>>>   Best Regards,
>>>>   Suijian
>>>>
>>>>
>>>>
>>>> 2014-03-06 16:24 GMT-06:00 Suijian Zhou <suijian.zhou@gmail.com>:
>>>>
>>>>> Hi, Experts,
>>>>>   I'm trying to run PageRank on a graph in Giraph, but the program
>>>>> always gets stuck.
>>>>> There are 8 input files, each ~2GB in size, all copied onto HDFS. I use
>>>>> 39 nodes, and each node has 16GB of memory and 8 cores. After 2 hours it
>>>>> keeps printing the same info (shown below) on the screen, with no
>>>>> apparent progress. What are the possible reasons? Small example files
>>>>> run without problems. Thanks!
>>>>>
>>>>> 14/03/06 16:17:42 INFO job.JobProgressTracker: Data from 39 workers -
>>>>> Compute superstep 0: 5854829 out of 49200000 vertices computed; 181 out of
>>>>> 1521 partitions computed
>>>>> 14/03/06 16:17:47 INFO job.JobProgressTracker: Data from 39 workers -
>>>>> Compute superstep 0: 5854829 out of 49200000 vertices computed; 181 out of
>>>>> 1521 partitions computed
>>>>>
>>>>>   Best Regards,
>>>>>   Suijian
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>>    Claudio Martella
>>>
>>>
>>
>>
>
>
> --
>    Claudio Martella
>
>
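
Note on the JobProgressTracker lines quoted in the first message of this thread: a stall like that can be confirmed from the log rather than by watching the screen. The following is only a sketch; the regular expression is written against the two lines quoted here, not against any documented Giraph log format, and the names `parse_progress` and `is_stalled` are hypothetical.

```python
import re

# Pattern inferred from the two log lines quoted in this thread (assumption).
PROGRESS = re.compile(
    r"(\d+) out of (\d+) vertices computed; "
    r"(\d+) out of (\d+) partitions computed"
)


def parse_progress(line):
    """Return (vertices_done, vertices_total, parts_done, parts_total) or None."""
    # Collapse whitespace first so lines wrapped by a mail client still match.
    match = PROGRESS.search(" ".join(line.split()))
    return tuple(int(g) for g in match.groups()) if match else None


def is_stalled(lines):
    """True if at least two progress reports were parsed and all are identical."""
    reports = [p for p in map(parse_progress, lines) if p is not None]
    return len(reports) >= 2 and len(set(reports)) == 1


sample = [
    "14/03/06 16:17:42 INFO job.JobProgressTracker: Data from 39 workers - "
    "Compute superstep 0: 5854829 out of 49200000 vertices computed; "
    "181 out of 1521 partitions computed",
    "14/03/06 16:17:47 INFO job.JobProgressTracker: Data from 39 workers - "
    "Compute superstep 0: 5854829 out of 49200000 vertices computed; "
    "181 out of 1521 partitions computed",
]
done, total = parse_progress(sample[0])[:2]
print("stalled:", is_stalled(sample), "fraction done:", done / total)
```

For the quoted lines, both reports are identical at roughly 12% of the vertices in superstep 0, which fits a job that has stopped making progress rather than one that is merely slow.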
