giraph-user mailing list archives

From Jianqiang Ou <oujianqiang...@gmail.com>
Subject Re: how to use out of core options
Date Fri, 18 Oct 2013 07:03:18 GMT
Thanks. I just tried another dataset, one that my cluster can process entirely
in memory. Exceptions still occur with the -Dgiraph.useOutOfCoreGraph=true
option, but the job works fine with only the
-Dgiraph.useOutOfCoreMessages=true
option, so do you still think it is the directory permission issue?

By the way, the dir path you mentioned should be the directory where the
out-of-core partitions and messages are stored on the local file system, right?
But how do I know where it is? It should be determined by Giraph rather than by
the application, right?

Thanks for your time and patience again,
Jian
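
For reference, here is a minimal dry-run sketch of how the options discussed in this thread fit on one command line. It only prints the command instead of launching a job. The jar name, the numeric limits, and the directory paths are placeholders chosen for illustration. The two directory options (giraph.partitionsDirectory and giraph.messagesDirectory) are an assumption based on the Giraph out-of-core documentation of that period and should be checked against the Giraph version in use.

```shell
#!/bin/sh
# Dry-run sketch: build the GiraphRunner command line and print it
# instead of executing it. All values below are placeholders.

OOC_OPTS="-Dgiraph.useOutOfCoreGraph=true"
OOC_OPTS="$OOC_OPTS -Dgiraph.useOutOfCoreMessages=true"
# Limits named in this thread; the numbers are illustrative only.
OOC_OPTS="$OOC_OPTS -Dgiraph.maxPartitionsInMemory=2"
OOC_OPTS="$OOC_OPTS -Dgiraph.maxMessagesInMemory=100000"
# ASSUMPTION: option names for the spill directories; verify them
# against your Giraph version before relying on them.
OOC_OPTS="$OOC_OPTS -Dgiraph.partitionsDirectory=/tmp/giraph/partitions"
OOC_OPTS="$OOC_OPTS -Dgiraph.messagesDirectory=/tmp/giraph/messages"

# The -D options go right after the GiraphRunner class name and
# before the computation class. giraph-examples.jar is a placeholder.
CMD="hadoop jar giraph-examples.jar org.apache.giraph.GiraphRunner $OOC_OPTS org.apache.giraph.examples.SimplePageRankComputation"

echo "$CMD"
```

The usual -vif, -vip, -vof, -op and -w arguments would follow the computation class, as in the full command quoted below. Whichever directories end up being used, the user running the tasks needs write permission on them.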


On Thu, Oct 17, 2013 at 5:32 PM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:

> Apart from these, you might also want to check the permissions of the dir path
> where the offloading of vertices and messages happens.
> Ideally, Giraph is not meant for out-of-core use: if your graph is much bigger
> than the cluster can handle in memory, using Giraph defeats the purpose in
> this case.
>
>
>
> On Thu, Oct 17, 2013 at 8:13 AM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>
>> Thanks very much. So are you saying that if I set -Dgiraph.maxPartitionsInMemory
>> and -Dgiraph.maxMessagesInMemory to smaller numbers, then
>> it might work?
>>
>> Thanks again,
>> Jian
>>
>>
>> On Thu, Oct 17, 2013 at 12:56 AM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>>
>>> You need to tune it for your cluster. This is what is mentioned in the docs:
>>> *"It is difficult to decide a general policy to use out-of-core
>>> capabilities*, as it depends on the behavior of the algorithm and the
>>> input graph. The exact number of partitions and messages to keep in memory
>>> depends on the cluster capabilities, the number of messages produced per
>>> superstep, and number of active vertices per superstep. Moreover, it
>>> depends on the type and size of vertex values and messages. For example,
>>> algorithms such as Belief Propagation tend to keep large vertex values,
>>> while algorithms such as clique computations tend to send large messages
>>> along. Hence, it depends on your algorithm what feature to rely on more."
>>>
>>> Thanks
>>>  Sundi
>>>
>>>
>>> On Wed, Oct 16, 2013 at 9:41 PM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>>>
>>>> Hi Sundi,
>>>>
>>>> I just tried your method, but somehow the job failed; the job history is
>>>> attached. The job ran fine without the out-of-core options. Do
>>>> you have any clue why that is?
>>>>
>>>> The command I used to run the program is below:
>>>>
>>>> $HADOOP_HOME/bin/hadoop jar
>>>> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>>>> org.apache.giraph.GiraphRunner -Dgiraph.useOutOfCoreMessages=true
>>>> -Dgiraph.useOutOfCoreGraph=true
>>>> org.apache.giraph.examples.SimplePageRankComputation -vif
>>>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>>>> -vip /user/andy/input/tiny_graph.txt -vof
>>>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>>>> /user/andy/output/page3 -w 3 -mc
>>>> org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute
>>>>
>>>> Many thanks,
>>>>
>>>> Jianqiang
>>>>
>>>> On Wed, Oct 16, 2013 at 12:11 PM, Jianqiang Ou <
>>>> oujianqiangooy@gmail.com> wrote:
>>>>
>>>>> got it, thank you very much!
>>>>>
>>>>>
>>>>> On Wed, Oct 16, 2013 at 10:43 AM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>>>>>
>>>>>> Put it as -Dgiraph.useOutOfCoreMessages=true
>>>>>> -Dgiraph.useOutOfCoreGraph=true after GiraphRunner,
>>>>>> like
>>>>>> hadoop jar giraph.jar org.apache.giraph.GiraphRunner -Dgiraph.useOutOfCoreMessages=true
>>>>>> -Dgiraph.useOutOfCoreGraph=true ...
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 16, 2013 at 7:29 AM, Jianqiang Ou <
>>>>>> oujianqiangooy@gmail.com> wrote:
>>>>>>
>>>>>>> Hi, I have a question about out-of-core Giraph. It is said that,
>>>>>>> in order to use disk to store the partitions, we need to use
>>>>>>> "giraph.useOutOfCoreGraph=true", but where should I put this
>>>>>>> setting?
>>>>>>>
>>>>>>> BTW, I am just trying to use the PageRank or shortest-paths example to
>>>>>>> test the out-of-core performance of my cluster.
>>>>>>>
>>>>>>> Thanks very much,
>>>>>>> Jian
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Jyotirmoy Sundi
>>>>>> Data Engineer,
>>>>>> Admobius
>>>>>>
>>>>>> San Francisco, CA 94158
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Best Regards,
>>> Jyotirmoy Sundi
>>> Data Engineer,
>>> Admobius
>>>
>>> San Francisco, CA 94158
>>>
>>
>>
>
>
> --
> Best Regards,
> Jyotirmoy Sundi
> Data Engineer,
> Admobius
>
> San Francisco, CA 94158
>
