giraph-user mailing list archives

From Mirko Kämpf <mirko.kae...@cloudera.com>
Subject Re: Sample data for Single Source shortest path
Date Sat, 01 Mar 2014 13:01:03 GMT
Hello,

if you build Giraph for a Hadoop 0.20.... release, the same jars will not
work with Hadoop 2.2.0.
Right now I build with the profile -Phadoop_2 from the current 1.1 branch in
the git repo.
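
A rebuild on the new machine could look roughly like this. It is only a
sketch: the checkout directory name and the -Dhadoop.version and -DskipTests
settings are assumptions, so adjust them to your setup.

```shell
# Assumes the Giraph git repo is checked out in ./giraph (hypothetical path).
cd giraph

# hadoop_2 is the profile mentioned above; hadoop.version (assumed property)
# pins the Hadoop release to build against, and -DskipTests shortens the build.
mvn -Phadoop_2 -Dhadoop.version=2.2.0 -DskipTests clean package
```

The resulting giraph-examples jar should then carry a Hadoop 2 suffix in its
name instead of "for-hadoop-0.20.203.0".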

How many nodes (physical servers or VMs) do you run on your 64-core system?
Which Hadoop distribution are you working with, and is it an MRv1 or MRv2
(YARN) setup?

Is your MapReduce system working properly? Can you run TeraSort, for
example?
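
A quick smoke test might look like the following. It is a sketch: the jar
location assumes a stock Apache Hadoop 2.2.0 layout under $HADOOP_HOME, and
the /tmp paths in HDFS are placeholders.

```shell
# Assumed location of the bundled examples jar; distros may place it elsewhere.
EXAMPLES_JAR="$HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar"

# Generate 1,000,000 rows of 100 bytes each (about 100 MB) as input.
hadoop jar "$EXAMPLES_JAR" teragen 1000000 /tmp/terasort-in

# Sort the generated data.
hadoop jar "$EXAMPLES_JAR" terasort /tmp/terasort-in /tmp/terasort-out

# Check that the sorted output is globally ordered.
hadoop jar "$EXAMPLES_JAR" teravalidate /tmp/terasort-out /tmp/terasort-report
```

If these jobs also run through the LocalJobRunner instead of being submitted
to the cluster, the problem is in the Hadoop configuration rather than in
Giraph.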

Cheers,
Mirko



On Sat, Mar 1, 2014 at 4:15 AM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:

> Could anyone please reply? Is it a portability problem? Does Giraph have any
> issues with Hadoop 2.2.0?
>
> Do I need to build Giraph on the new system?
>
> Thanks
>
>
>
> On Sat, Mar 1, 2014 at 2:28 PM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:
>
>> Hi Sebastian,
>> Thanks for the links to big graphs.
>>
>> Actually, I want to tell you about a problem I am facing.
>>
>> Initially I was working with *hadoop 0.20.203*. I built Giraph there and
>> it was running fine.
>>
>> Now, to test problems on very big graphs and to compare the
>> performance, I moved to a new system with 64 cores, 512 GB memory,
>> and 3 TB storage. Instead of building Giraph on the new system, I just
>> copied the Giraph folder from my previous system to the new one. The
>> new system runs *hadoop version 2.2.0*. I tried to execute the
>> SimpleShortestPathsComputation example on a sample data set. It throws
>> the exception shown below.
>>
>> I used the following command to execute the job:
>>
>> hadoop jar
>> /home/abcd2014/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar
>> org.apache.giraph.GiraphRunner -Dgiraph.SplitMasterWorker=true
>> org.apache.giraph.examples.SimpleShortestPathsComputation -vif
>> org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat
>> -vip /user/abcd2014/giraph_input/tiny_graph.txt -vof
>> org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op
>> /user/abcd2014/output2/shortestpaths -w 1
>>
>>
>>
>> 14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge input format
>> specified. Ensure your InputFormat does not require one.
>> 14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge output format
>> specified. Ensure your OutputFormat does not require one.
>> 14/03/01 12:44:46 INFO Configuration.deprecation:
>> mapreduce.job.counters.limit is deprecated. Instead, use
>> mapreduce.job.counters.max
>> 14/03/01 12:44:46 INFO Configuration.deprecation:
>> mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
>> 14/03/01 12:44:46 INFO Configuration.deprecation:
>> mapred.job.reduce.memory.mb is deprecated. Instead, use
>> mapreduce.reduce.memory.mb
>> 14/03/01 12:44:46 INFO Configuration.deprecation:
>> mapred.map.tasks.speculative.execution is deprecated. Instead, use
>> mapreduce.map.speculative
>> 14/03/01 12:44:46 INFO Configuration.deprecation:
>> mapreduce.user.classpath.first is deprecated. Instead, use
>> mapreduce.job.user.classpath.first
>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.max.attempts
>> is deprecated. Instead, use mapreduce.map.maxattempts
>> 14/03/01 12:44:46 INFO job.GiraphJob: run: Since checkpointing is
>> disabled (default), do not allow any task retries (setting
>> mapred.map.max.attempts = 0, old value = 4)
>> 14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.tracker is
>> deprecated. Instead, use mapreduce.jobtracker.address
>>
>> *Exception in thread "main" java.lang.IllegalArgumentException:
>> checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run
>> in split master / worker mode since there is only 1 task at a time! *
>> at
>> org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:165)
>>     at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:233)
>>     at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>>     at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>>     at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>>     at java.lang.reflect.Method.invoke(Method.java:606)
>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
>>
>>
>>
>> Could you suggest something to fix this? If you need any further
>> details, please let me know.
>>
>> Thanks & Regards
>>
>> Jyoti
>>
>>
>>
>>
>> On Sat, Mar 1, 2014 at 1:35 PM, Sebastian Schelter <ssc@apache.org> wrote:
>>
>>> Hi Jyoti,
>>>
>>> You can find a couple of very large graphs in KONECT [1] and on the
>>> website of the laboratory for web algorithmics from the University of Milan
>>> [2]. You will probably have to convert them to an appropriate format for
>>> Giraph.
>>>
>>> Best,
>>> Sebastian
>>>
>>> [1] http://konect.uni-koblenz.de/
>>> [2] http://law.di.unimi.it/datasets.php
>>>
>>>
>>> On 03/01/2014 05:22 AM, Jyoti Yadav wrote:
>>>
>>>> Hi folks,
>>>>
>>>> I got a new system with 64 cores, 512 GB memory, and 3 TB
>>>> storage. I want to test the performance of Giraph on this system.
>>>> Could anyone provide a link to a very large graph so that I can
>>>> run the Single Source Shortest Path example? For this algorithm the
>>>> graph must be weighted, and the input format for feeding it into
>>>> Giraph is JsonLongDoubleFloatDouble.
>>>>
>>>> Thanks in advance...
>>>> With Regards
>>>>
>>>> Jyoti
>>>>
>>>>
>>>
>>
>


-- 
Mirko Kämpf

*Trainer* @ Cloudera

tel: +49 *176 20 63 51 99*
skype: *kamir1604*
mirko@cloudera.com
