giraph-user mailing list archives

From Sebastian Schelter <ssc.o...@googlemail.com>
Subject Re: Trying to implement program to find betweenness centrality in giraph
Date Fri, 08 Feb 2013 12:25:48 GMT
How large is the graph for which you are trying to compute betweenness
centrality?

On 08.02.2013 13:22, Claudio Martella wrote:
> Unfortunately there is no way to disable the counter limit completely.
> Counters are very expensive as they require the jobtracker to keep a lot of
> information for the whole duration of the job (it is the jobtracker that is
> aggregating the counters from each tasktracker).
> Fortunately, many real-world graphs out there tend to have a "small"
> diameter. I think you're going to hit a wall only on real random graphs and
> with many random sources.
> 
> The second option you have is to run one job per source, paying the
> overhead of the setup, input and output supersteps for each job.
> 
> 
> On Fri, Feb 8, 2013 at 3:35 AM, pradeep kumar <pradeep0802@gmail.com> wrote:
> 
>> Hi Dionysis,
>>
>> i was getting those errors because of the configuration issues (even
>> after reading all the input values correctly it was not able to start
>> compute for superstep 0).. it was because i was running program in wrong
>> way..
>>
>> (you can check till where your program is running by putting some print
>> st's and checking o/p in syslog)
>>
>> following cmd worked for me
>>
>> hadoop jar giraph-0.2-SNAPSHOT-for-hadoop-0.20.2-jar-with-dependencies.jar
>> org.apache.giraph.examples.bc_random
>> org.apache.giraph.examples.bc_randomVertex -_GMInputFormat ADJ
>> org.apache.giraph.examples.bc_randomVertexInputFormat -_GMOutputFormat ADJ
>> org.apache.giraph.examples.bc_randomVertexOutputFormat --K 5 -i inputbet -o
>> out -w 1
>>
>>
>> @Claudio : Sorry actually my prog was exceeding counter limit which got
>> resolved by increasing the counter limit in mapred (gave 1024 i kw its
>> never gona hit even close to that :-)) as you suggested,
>>
>> But it was just for small sample input (more of like linear graph), i am
>> worried about actual graph which is huge and i might not even know the longest
>> shortest path in graph (randomly changing) so..
>>
>> was just asking is there a way i can disable the counter limit check..
>>
>>
>>
>> On Fri, Feb 8, 2013 at 2:51 AM, Claudio Martella <
>> claudio.martella@gmail.com> wrote:
>>
>>> Sorry pradeep, I'm not sure I understand your problem. What about the 361
>>> supersteps and the counters limit change?
>>> Did you solve it?
>>>
>>>
>>> On Thu, Feb 7, 2013 at 5:43 PM, pradeep kumar <pradeep0802@gmail.com> wrote:
>>>
>>>> Hi Claudio
>>>>    actually i did increase the limit on my laptop to 1024 (pseudo mode) it
>>>> took 361 supersteps for 15 nodes and 18 edges (depth 10+), we will soon be
>>>> testing this on cluster of 5 or 6 hope everything goes well. Is there a way
>>>> i can disable this check..? i remember someone posting this earlier..!! but
>>>> not sure how..;-(
>>>>
>>>>  @Jan, yes,  actually even i want to see results,..:-)
>>>>
>>>> PS thanks for all help..
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Feb 7, 2013 at 9:47 PM, Jan van der Lugt <janlugt@gmail.com> wrote:
>>>>
>>>>> I never actually ran bc_random, so I don't know how many supersteps it
>>>>> needs for different input sizes. If you could try this out, it would be
>>>>> worthwhile to know. Especially the relation between input size / number of
>>>>> supersteps. It's a limitation of the Pregel paradigm that you need quite a
>>>>> few supersteps to do BFS, not a lot we can do about this...
>>>>>
>>>>>
>>>>> On Thu, Feb 7, 2013 at 3:47 PM, Claudio Martella <
>>>>> claudio.martella@gmail.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> you have to increase the counters limit in your cluster. You can
>>>>>> increase the limit by setting the property mapreduce.job.counters.limit in
>>>>>> your mapred-site.xml accordingly.
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 7, 2013 at 4:34 PM, pradeep kumar <pradeep0802@gmail.com> wrote:
>>>>>>
>>>>>>> Hi Jan,
>>>>>>>
>>>>>>> well actually i just corrected my mistake.. i was running it wrong
>>>>>>> way, thanks..
>>>>>>> just 1 more question on map-reduce counters
>>>>>>>
>>>>>>> giraph by default limits the superstep counters to 120
>>>>>>> but in bc_random master superstep itself exceeds this, so i had to
>>>>>>> increase the limit..this was just for 20 nodes sample i/p i was trying
>>>>>>>
>>>>>>> my real data will be close to million nodes,  so how can i manage for
>>>>>>> such large input..
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Feb 7, 2013 at 4:02 PM, pradeep kumar <pradeep0802@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi lugt,
>>>>>>>>
>>>>>>>> Thanks for all your help.. program compiled successfully into jar..
>>>>>>>> but actually i am not able to run programs..
>>>>>>>> i tried following cmd to run
>>>>>>>>
>>>>>>>> 1) using GiraphRunner
>>>>>>>> hadoop jar
>>>>>>>> giraph-0.2-SNAPSHOT-for-hadoop-0.20.2-jar-with-dependencies.jar
>>>>>>>> org.apache.giraph.GiraphRunner org.apache.giraph.examples.bc_randomVertex
>>>>>>>>  -vif org.apache.giraph.examples.bc_randomVertexInputFormat -vip inputbet
>>>>>>>>  -of org.apache.giraph.examples.bc_randomVertexOutputFormat -op outputbet
>>>>>>>> -w 1
>>>>>>>>
>>>>>>>> it does maps every thing successfully but then gives following error
>>>>>>>> in log
>>>>>>>>
>>>>>>>> java.lang.IllegalStateException: run: Caught an unrecoverable exception waitFor: ExecutionException occurred while waiting for org.apache.giraph.utils.ProgressableUtils$FutureWaitable@40fd6cd8
>>>>>>>> 	at org.apache.giraph.graph.GraphMapper.run(GraphMapper.java:102)
>>>>>>>> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>>>>>>>> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>>>>>>>> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>>>>>>>> 	at java.security.AccessController.doPrivileged(Native Method)
>>>>>>>> 	at javax.security.auth.Subject.doAs(Subject.java:415)
>>>>>>>>
>>>>>>>> followed by some other errors
>>>>>>>>
>>>>>>>> 2) bc_random
>>>>>>>>
>>>>>>>> i actually couldnt understand all arg types here
>>>>>>>>
>>>>>>>> used :
>>>>>>>>
>>>>>>>> hadoop jar giraph-0.2-SNAPSHOT-for-hadoop-0.20.2-jar-with-dependencies.jar
>>>>>>>> org.apache.giraph.examples.bc_random org.apache.giraph.examples.bc_randomVertex --GMInputFormat
>>>>>>>> ADJ  org.apache.giraph.examples.bc_randomVertexInputFormat -i inputbet  --GMOutputFormat
>>>>>>>> ADJ org.apache.giraph.examples.bc_randomVertexOutputFormat -o outputbet -w 1
>>>>>>>>
>>>>>>>> but gives error for k procedure arg (not sure what to give as value
>>>>>>>> here, or even about the i/p format i mentioned above)
>>>>>>>>
>>>>>>>>
>>>>>>>> I have uploaded sample input in hdfs in following format
>>>>>>>>
>>>>>>>> 10 4500 9 800 11 1000 12 1000
>>>>>>>> 11 5500 12 1100 11 800
>>>>>>>>
>>>>>>>> nid val <eid val>
>>>>>>>>
>>>>>>>> did i make a mistake anywhere..?
>>>>>>>>
>>>>>>>> any suggestion will be a great help..!!
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 6, 2013 at 5:03 PM, Jan van der Lugt <janlugt@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Pradeep,
>>>>>>>>>
>>>>>>>>> You can check out the Green-Marl compiler from my Github
>>>>>>>>> (https://github.com/janlugt/Green-Marl), that one should be
>>>>>>>>> compatible with the latest version of Giraph. Please let me know if
>>>>>>>>> anything doesn't work for you. The changes will probably also be accepted
>>>>>>>>> upstream later this day.
>>>>>>>>>
>>>>>>>>> - Jan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Feb 6, 2013 at 9:48 AM, Jan van der Lugt <janlugt@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> It should work with version 0.2, but let me check. Since the API
>>>>>>>>>> is changing all the time at the moment, it's very difficult to track all
>>>>>>>>>> those changes. Probably they are numerous instances of the same error.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Feb 6, 2013 at 9:26 AM, pradeep kumar <
>>>>>>>>>> pradeep0802@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Lugt,
>>>>>>>>>>>
>>>>>>>>>>> Thanks for suggestion, i just tried with Green-Marl's bc_random,
>>>>>>>>>>> But because of so many changes in Giraph codebase, imports and
>>>>>>>>>>> many other functions in bc_random
>>>>>>>>>>> generate lots of errors almost 100+ errors.. I tried with Giraph
>>>>>>>>>>> 1.0 and current version of 2.0,
>>>>>>>>>>> most of errors i have resolved but still there are many errors..
>>>>>>>>>>> Any suggestion on this..?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Pradeep Kumar
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Pradeep Kumar
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>>    Claudio Martella
>>>>>>    claudio.martella@gmail.com
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Pradeep Kumar
>>>>
>>>
>>>
>>>
>>> --
>>>    Claudio Martella
>>>    claudio.martella@gmail.com
>>>
>>
>>
>>
>> --
>> Pradeep Kumar
>>
> 
> 
> 

