giraph-user mailing list archives

From Claudio Martella <claudio.marte...@gmail.com>
Subject Re: how to use out of core options
Date Sat, 19 Oct 2013 15:21:37 GMT
Looking at your logs, there's a null pointer exception, which looks like a bug to
me. Which version are you running, and what command are you using to run the job?
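
For comparison, here is a minimal dry-run sketch of how the options mentioned in the quoted messages below could be put on the command line. It only prints the command rather than running it. The jar name and the two in-memory limits are hypothetical placeholders, not recommendations; the right numbers depend on your cluster and algorithm.

```shell
#!/bin/sh
# Dry run: prints the hadoop command line instead of executing it.

# Hypothetical placeholders; substitute your own jar and limits.
GIRAPH_JAR="giraph-examples-jar-with-dependencies.jar"
MAX_PARTITIONS=5        # illustrative value only, not a recommendation
MAX_MESSAGES=100000     # illustrative value only, not a recommendation

giraph_cmd() {
  # The -D options go right after GiraphRunner, ahead of the remaining
  # arguments, as suggested in the thread.
  printf '%s ' \
    hadoop jar "$GIRAPH_JAR" org.apache.giraph.GiraphRunner \
    -Dgiraph.useOutOfCoreGraph=true \
    -Dgiraph.useOutOfCoreMessages=true \
    "-Dgiraph.maxPartitionsInMemory=$MAX_PARTITIONS" \
    "-Dgiraph.maxMessagesInMemory=$MAX_MESSAGES" \
    "$@"
  printf '\n'
}

# Everything after the -D options is passed through unchanged.
giraph_cmd org.apache.giraph.examples.SimplePageRankComputation -w 3
```

Printing the command first makes it easy to paste the exact invocation into a reply, and to toggle one option at a time when narrowing down which one triggers the exception.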


On Fri, Oct 18, 2013 at 9:03 AM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:

> Thanks, I just tried another dataset, which my cluster can handle
> entirely in memory. However, exceptions still occurred with the
> -Dgiraph.useOutOfCoreGraph=true option, while the job works fine with only the
> -Dgiraph.useOutOfCoreMessages=true option. So do you still think it is a
> directory permission issue?
>
> By the way, the directory you mentioned should be the one that stores the
> out-of-core partitions and messages on the local file system, right? But how do I
> know where it is? It should be determined by Giraph rather than by the
> application, right?
>
> Thanks for your time and patience again,
> Jian
>
>
> On Thu, Oct 17, 2013 at 5:32 PM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>
>> Apart from these, you might also want to check the permissions of the directory
>> where the offloading of vertices and messages happens.
>> Ideally, Giraph is not meant for out-of-core use: if your graph is much bigger
>> than the cluster can handle in memory, using Giraph defeats the purpose in
>> this case.
>>
>>
>>
>> On Thu, Oct 17, 2013 at 8:13 AM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>>
>>> Thanks very much. So are you saying that if I use -Dgiraph.maxPartitionsInMemory
>>> and -Dgiraph.maxMessagesInMemory and set both to smaller numbers, then
>>> it might work?
>>>
>>> Thanks again,
>>> Jian
>>>
>>>
>>> On Thu, Oct 17, 2013 at 12:56 AM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>>>
>>>> You need to tune it for your cluster. This is what is mentioned in the
>>>> docs:
>>>> *"It is difficult to decide a general policy to use out-of-core
>>>> capabilities*, as it depends on the behavior of the algorithm and the
>>>> input graph. The exact number of partitions and messages to keep in memory
>>>> depends on the cluster capabilities, the number of messages produced per
>>>> superstep, and number of active vertices per superstep. Moreover, it
>>>> depends on the type and size of vertex values and messages. For example,
>>>> algorithms such as Belief Propagation tend to keep large vertex values,
>>>> while algorithms such as clique computations tend to send large messages
>>>> along. Hence, it depends on your algorithm what feature to rely on more."
>>>>
>>>> Thanks
>>>>  Sundi
>>>>
>>>>
>>>> On Wed, Oct 16, 2013 at 9:41 PM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>>>>
>>>>> Hi Sundi,
>>>>>
>>>>> I just tried your method, but somehow the job failed; attached is
>>>>> the history of the job. It ran fine without the out-of-core options. Do
>>>>> you have any clue why that is?
>>>>>
>>>>> The command I used to run the program is below:
>>>>>
>>>>> $HADOOP_HOME/bin/hadoop jar
>>>>> $GIRAPH_HOME/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>>>>> org.apache.giraph.GiraphRunner
>>>>> -Dgiraph.useOutOfCoreMessages=true -Dgiraph.useOutOfCoreGraph=true
>>>>> org.apache.giraph.examples.SimplePageRankComputation -vif
>>>>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>>>>> -vip /user/andy/input/tiny_graph.txt -vof
>>>>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>>>>> /user/andy/output/page3 -w 3 -mc
>>>>> org.apache.giraph.examples.SimplePageRankComputation\$SimplePageRankMasterCompute
>>>>>
>>>>> Many thanks,
>>>>>
>>>>> Jianqiang
>>>>>
>>>>> On Wed, Oct 16, 2013 at 12:11 PM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>>>>>
>>>>>> got it, thank you very much!
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 16, 2013 at 10:43 AM, Jyotirmoy Sundi <sundi133@gmail.com> wrote:
>>>>>>
>>>>>>> Put it as -Dgiraph.useOutOfCoreMessages=true
>>>>>>> -Dgiraph.useOutOfCoreGraph=true after GiraphRunner,
>>>>>>> like
>>>>>>> hadoop jar giraph.jar org.apache.giraph.GiraphRunner -Dgiraph.useOutOfCoreMessages=true
>>>>>>> -Dgiraph.useOutOfCoreGraph=true ...
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Oct 16, 2013 at 7:29 AM, Jianqiang Ou <oujianqiangooy@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi, I have a question about out-of-core Giraph. It is said that,
>>>>>>>> in order to use disk to store the partitions, we need to use
>>>>>>>> "giraph.useOutOfCoreGraph=true", but where should I put this
>>>>>>>> statement?
>>>>>>>>
>>>>>>>> BTW, I am just trying to use the PageRank or shortest-path example
>>>>>>>> to test the out-of-core performance of my cluster.
>>>>>>>>
>>>>>>>> Thanks very much,
>>>>>>>> Jian
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Jyotirmoy Sundi
>>>>>>> Data Engineer,
>>>>>>> Admobius
>>>>>>>
>>>>>>> San Francisco, CA 94158
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Jyotirmoy Sundi
>>>> Data Engineer,
>>>> Admobius
>>>>
>>>> San Francisco, CA 94158
>>>>
>>>
>>>
>>
>>
>> --
>> Best Regards,
>> Jyotirmoy Sundi
>> Data Engineer,
>> Admobius
>>
>> San Francisco, CA 94158
>>
>
>


-- 
   Claudio Martella
   claudio.martella@gmail.com
