Hi Mirko,

Thanks for your reply. All MapReduce programs are running fine on this system, and it is a YARN setup.

Please guide me on how to build Giraph with this Hadoop version. Do I also need to install an external ZooKeeper?

Thanks in advance.


On Sat, Mar 1, 2014 at 6:31 PM, Mirko Kämpf <mirko.kaempf@cloudera.com> wrote:

If you build Giraph for Hadoop 0.20.x, the same jars will not work with Hadoop 2.2.0.
Right now I build with the profile -Phadoop_2 from the current 1.1 branch in the git repo.
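
A rough sketch of what that build could look like with the -Phadoop_2 profile mentioned above. The -Dhadoop.version property, the -DskipTests flag and the helper name giraph_build_cmd are my assumptions, not something confirmed in this thread, so please check them against the pom.xml of the branch you check out.

```shell
#!/bin/sh
# Hadoop version on the target cluster (2.2.0 in this thread).
HADOOP_VERSION="${HADOOP_VERSION:-2.2.0}"
# Maven profile to build with; hadoop_2 is the one named above.
GIRAPH_PROFILE="${GIRAPH_PROFILE:-hadoop_2}"

# Hypothetical helper: prints the Maven command instead of running it,
# so it can be reviewed first. -Dhadoop.version is an assumed property name.
giraph_build_cmd() {
  echo "mvn -P${GIRAPH_PROFILE} -Dhadoop.version=${HADOOP_VERSION} -DskipTests clean package"
}

# Show the command; to actually build, run it from the Giraph source root:
#   eval "$(giraph_build_cmd)"
giraph_build_cmd
```

Printing the command first is deliberate: the build has to happen on (or for) the new system, and it is easier to correct the profile or version before Maven starts.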

How many nodes (physical servers or VMs) do you run on your 64 core system?
What distro of Hadoop are you working with? And is it an MRv1 or MRv2 (YARN) setup?

Is your MapReduce system working properly? Can you run TeraSort, for example?
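
Since the exception quoted below complains about LocalJobRunner, it may also be worth looking at what the client-side configuration says about the job tracker and framework. A minimal sketch, assuming the config sits in mapred-site.xml under $HADOOP_CONF_DIR (the /etc/hadoop/conf fallback is a guess) and that each property has its <value> directly after its <name>. site_prop is a hypothetical helper doing a naive text match, not a Hadoop tool and not a real XML parser.

```shell
#!/bin/sh
# site_prop FILE NAME: print the <value> configured for property NAME in a
# Hadoop *-site.xml file; prints nothing if the property is not set there.
site_prop() {
  # Flatten the file to one line, then pull out the value following the name.
  { tr -d '\n' < "$1"; echo; } |
    sed -n "s:.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*:\1:p"
}

# Assumed location of the client-side config; adjust for your install.
conf="${HADOOP_CONF_DIR:-/etc/hadoop/conf}/mapred-site.xml"
if [ -r "$conf" ]; then
  for key in mapreduce.framework.name mapred.job.tracker mapreduce.jobtracker.address; do
    echo "$key = $(site_prop "$conf" "$key")"
  done
fi
```

An empty value means the property is not set in that file, so Hadoop falls back to its default. As far as I can tell the default for these is "local", which would fit the LocalJobRunner message.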


On Sat, Mar 1, 2014 at 4:15 AM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:
Anyone, please reply. Is it a portability problem? Does Giraph have any issues with Hadoop 2.2.0?

Do I need to build Giraph on the new system?


On Sat, Mar 1, 2014 at 2:28 PM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:
Hi Sebastian,
Thanks for the links to big graphs.

Actually, I want to tell you about a problem I am facing.

Initially I was working with Hadoop 0.20.203. I built Giraph there and it was running fine.

Now, to test problems on very big graphs and to compare performance, I have moved to a new system with 64 cores, 512 GB of memory and 3 TB of storage. Instead of building Giraph on the new system, I just copied the Giraph folder over from my previous system. The new system runs Hadoop 2.2.0. I tried to execute the single-source shortest paths example (SimpleShortestPathsComputation) on a sample data set, and it throws the exception shown below.

I gave the following command to execute the job:

hadoop jar /home/abcd2014/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop- org.apache.giraph.GiraphRunner -Dgiraph.SplitMasterWorker=true  org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/abcd2014/giraph_input/tiny_graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/abcd2014/output2/shortestpaths -w 1

14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
14/03/01 12:44:46 INFO Configuration.deprecation: mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/03/01 12:44:46 INFO Configuration.deprecation: mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
14/03/01 12:44:46 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
    at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:165)
    at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:233)
    at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

Could you suggest something to fix this? If you need any further details, please let me know.

Thanks & Regards


On Sat, Mar 1, 2014 at 1:35 PM, Sebastian Schelter <ssc@apache.org> wrote:
Hi Jyoti,

You can find a couple of very large graphs in KONECT [1] and on the website of the Laboratory for Web Algorithmics at the University of Milan [2]. You will probably have to convert them to an appropriate format for Giraph.


[1] http://konect.uni-koblenz.de/
[2] http://law.di.unimi.it/datasets.php

On 03/01/2014 05:22 AM, Jyoti Yadav wrote:
Hi folks,

I got a new system with 64 cores, 512 GB of memory and 3 TB of
storage. I want to test the performance of Giraph on this system.
Could anyone point me to a very large graph so that I can
run the Single Source Shortest Path example? For this algorithm the graph
should be weighted, and to feed it into Giraph the input format is

Thanks in advance...
With Regards


Mirko Kämpf

Trainer @ Cloudera

tel: +49 176 20 63 51 99
skype: kamir1604