giraph-user mailing list archives

From Jyotirmoy Sundi <sundi...@gmail.com>
Subject Re: how to use out of core options
Date Thu, 17 Oct 2013 21:32:08 GMT
Apart from these, you might also want to check the permissions of the
directory path where vertices and messages are offloaded.
Ideally, Giraph is not meant to rely on out-of-core execution: if your graph
is much bigger than the cluster can handle in memory, using Giraph defeats
the purpose.
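
For what it's worth, here is a rough shell sketch of how the two limits asked about in the quoted message below could be passed alongside the out-of-core switches. The function name ooc_opts and the values 5 and 100000 are made up for illustration and are not recommendations; giraph.jar stands in for your actual jar, and the trailing ... is left for the rest of your arguments.

```shell
# Build the -D options for an out-of-core run.
# $1 = partitions to keep in memory, $2 = messages to keep in memory.
# Both values are placeholders to be tuned per cluster and algorithm.
ooc_opts() {
  max_partitions="$1"
  max_messages="$2"
  printf '%s ' \
    "-Dgiraph.useOutOfCoreGraph=true" \
    "-Dgiraph.useOutOfCoreMessages=true" \
    "-Dgiraph.maxPartitionsInMemory=${max_partitions}" \
    "-Dgiraph.maxMessagesInMemory=${max_messages}"
}

# Print the command instead of running it, so it can be reviewed first.
echo "\$HADOOP_HOME/bin/hadoop jar giraph.jar org.apache.giraph.GiraphRunner $(ooc_opts 5 100000)..."
```

The options go right after GiraphRunner and before the computation class, as in the earlier reply in this thread.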



On Thu, Oct 17, 2013 at 8:13 AM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:

> Thanks very much. So are you saying that if I set
> -Dgiraph.maxPartitionsInMemory and -Dgiraph.maxMessagesInMemory to
> smaller numbers, it might work?
>
> Thanks again,
> Jian
>
>
> On Thu, Oct 17, 2013 at 12:56 AM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>
>> You need to tune it for your cluster. This is what is mentioned in the docs:
>> "It is difficult to decide a general policy to use out-of-core
>> capabilities, as it depends on the behavior of the algorithm and the
>> input graph. The exact number of partitions and messages to keep in memory
>> depends on the cluster capabilities, the number of messages produced per
>> superstep, and number of active vertices per superstep. Moreover, it
>> depends on the type and size of vertex values and messages. For example,
>> algorithms such as Belief Propagation tend to keep large vertex values,
>> while algorithms such as clique computations tend to send large messages
>> along. Hence, it depends on your algorithm what feature to rely on more."
>>
>> Thanks
>>  Sundi
>>
>>
>> On Wed, Oct 16, 2013 at 9:41 PM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>>
>>> Hi Sundi,
>>>
>>> I just tried your method, but somehow the job failed. The history of
>>> the job is attached. The job ran fine without the out-of-core options.
>>> Do you have any clue why that is?
>>>
>>> The command I used to run the program is below:
>>>
>>> $HADOOP_HOME/bin/hadoop jar
>>> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>>> org.apache.giraph.GiraphRunner -Dgiraph.useOutOfCoreMessages=true
>>> -Dgiraph.useOutOfCoreGraph=true
>>> org.apache.giraph.examples.SimplePageRankComputation -vif
>>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>>> -vip /user/andy/input/tiny_graph.txt -vof
>>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>>> /user/andy/output/page3 -w 3 -mc
>>> org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute
>>>
>>> Many thanks,
>>>
>>> Jianqiang
>>>
>>> On Wed, Oct 16, 2013 at 12:11 PM, Jianqiang Ou <oujianqiangooy@gmail.com
>>> > wrote:
>>>
>>>> got it, thank you very much!
>>>>
>>>>
>>>> On Wed, Oct 16, 2013 at 10:43 AM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>>>>
>>>>> Put -Dgiraph.useOutOfCoreMessages=true and
>>>>> -Dgiraph.useOutOfCoreGraph=true after GiraphRunner,
>>>>> like this:
>>>>> hadoop jar giraph.jar org.apache.giraph.GiraphRunner -Dgiraph.useOutOfCoreMessages=true
>>>>> -Dgiraph.useOutOfCoreGraph=true ...
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 16, 2013 at 7:29 AM, Jianqiang Ou <
>>>>> oujianqiangooy@gmail.com> wrote:
>>>>>
>>>>>> Hi, I have a question about out-of-core Giraph. It is said that,
>>>>>> in order to use disk to store the partitions, we need to use
>>>>>> "giraph.useOutOfCoreGraph=true", but where should I put this
>>>>>> statement?
>>>>>>
>>>>>> BTW, I am just trying to use the PageRank or shortest-path example
>>>>>> to test the out-of-core performance of my cluster.
>>>>>>
>>>>>> Thanks very much,
>>>>>> Jian
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Jyotirmoy Sundi
>>>>> Data Engineer,
>>>>> Admobius
>>>>>
>>>>> San Francisco, CA 94158
>>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Jyotirmoy Sundi
>> Data Engineer,
>> Admobius
>>
>> San Francisco, CA 94158
>>
>
>


-- 
Best Regards,
Jyotirmoy Sundi
Data Engineer,
Admobius

San Francisco, CA 94158
