Here is my work log with some steps I need to prep for building Giraph:

Requires Maven 3.x

mvn -version
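
A minimal sketch of turning that check into a yes/no answer, assuming the first line of the output has the usual "Apache Maven X.Y.Z" form. The function names maven_major and maven_ok are made up for this example, and treating anything from major version 3 upward as acceptable is an assumption.

```shell
# Extract the major version from the banner printed by `mvn -version`,
# e.g. "Apache Maven 3.0.5 (...)" -> "3". Prints nothing if no banner is found.
maven_major() {
    printf '%s\n' "$1" | sed -n 's/^Apache Maven \([0-9][0-9]*\)\..*/\1/p' | head -n 1
}

# Succeed only if the banner reports Maven 3 or newer (assumed threshold).
maven_ok() {
    major=$(maven_major "$1")
    [ -n "$major" ] && [ "$major" -ge 3 ]
}

# Usage (commented out so that sourcing this snippet has no side effects):
# maven_ok "$(mvn -version 2>/dev/null)" || echo "Maven 3.x is required" >&2
```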


Install JDK 1.7

http://www.if-not-true-then-false.com/2010/install-sun-oracle-java-jdk-jre-7-on-fedora-centos-red-hat-rhel/


## java ##
sudo alternatives --install /usr/bin/java java /usr/java/jdk1.7.0_51/jre/bin/java 200000

## javaws ##
sudo alternatives --install /usr/bin/javaws javaws /usr/java/jdk1.7.0_51/jre/bin/javaws 200000

## Java Browser (Mozilla) Plugin 32-bit ##
sudo alternatives --install /usr/lib/mozilla/plugins/libjavaplugin.so libjavaplugin.so /usr/java/jdk1.7.0_51/jre/lib/i386/libnpjp2.so 200000

## Java Browser (Mozilla) Plugin 64-bit ##
sudo alternatives --install /usr/lib64/mozilla/plugins/libjavaplugin.so libjavaplugin.so.x86_64 /usr/java/jdk1.7.0_51/jre/lib/amd64/libnpjp2.so 200000

## Install javac only if you installed JDK (Java Development Kit) package ##
sudo alternatives --install /usr/bin/javac javac /usr/java/jdk1.7.0_51/bin/javac 200000
sudo alternatives --install /usr/bin/jar jar /usr/java/jdk1.7.0_51/bin/jar 200000


Check JDK

export JAVA_HOME="/usr/java/jdk1.7.0_51"

java -version
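
A small sketch for double-checking the two lines above, assuming that a JDK home is recognizable by an executable bin/java and bin/javac. The helper names jdk_home_ok and java_version_of are hypothetical. Note that `java -version` writes to stderr, hence the `2>&1` in the usage comment.

```shell
# Succeed if DIR looks like a JDK home: both bin/java and bin/javac are executable.
jdk_home_ok() {
    [ -x "$1/bin/java" ] && [ -x "$1/bin/javac" ]
}

# Extract the quoted version string from `java -version` output,
# e.g. 'java version "1.7.0_51"' -> '1.7.0_51'.
java_version_of() {
    printf '%s\n' "$1" | sed -n 's/^[a-z]* version "\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (commented out so that sourcing this snippet has no side effects):
# jdk_home_ok "$JAVA_HOME" || echo "JAVA_HOME is not a JDK: $JAVA_HOME" >&2
# java_version_of "$(java -version 2>&1)"
```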


Checkout sources

git clone https://git-wip-us.apache.org/repos/asf/giraph.git



Apply the latest version of the unmerged documentation patch

cd giraph

wget https://issues.apache.org/jira/secure/attachment/12630040/GIRAPH-849.v3.patch

git apply --stat GIRAPH-849.v3.patch

git apply --check GIRAPH-849.v3.patch

git apply GIRAPH-849.v3.patch




Build Giraph

mvn -Phadoop_2 -fae -DskipTests clean install

mvn -Phadoop_2 -DskipTests -Ddependency.locations.enabled=false site

mvn -Phadoop_2 -DskipTests site:stage
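
The three build steps can be wrapped so that the profile is a parameter and a failing step stops the sequence. This is only a sketch: the function name giraph_build is made up, the default profile is the hadoop_2 used in this log, and any other profile name has to come from the pom.xml of your checkout.

```shell
# Run the three Maven steps from this log in order, stopping at the first
# failure. The profile defaults to hadoop_2.
giraph_build() {
    profile=${1:-hadoop_2}
    mvn "-P$profile" -fae -DskipTests clean install &&
    mvn "-P$profile" -DskipTests -Ddependency.locations.enabled=false site &&
    mvn "-P$profile" -DskipTests site:stage
}

# Usage, from the root of the Giraph checkout (commented out on purpose):
# giraph_build            # uses -Phadoop_2
# giraph_build hadoop_2   # same, with the profile spelled out
```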


Do some cool work on doc and code … ;-)


Grep for some code:

grep -r --include="*.java" WHAT WHERE
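
As a concrete instance of the pattern above, here is a sketch that searches the Java sources for the method named in the stack trace further down this thread. The wrapper name grep_java is made up, and the added -n flag (print line numbers) is not part of the original command.

```shell
# Recursively search only *.java files under DIR ($2) for PATTERN ($1),
# printing file names and line numbers.
grep_java() {
    grep -rn --include="*.java" -- "$1" "$2"
}

# Usage, from the root of the Giraph checkout (commented out on purpose):
# grep_java checkLocalJobRunnerConfiguration .
```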



Create the patch and submit it to JIRA and to the Review Board

http://ariejan.net/2009/10/26/how-to-create-and-apply-a-patch-with-git/

git diff --no-prefix trunk > GIRAPH-{ISSUE_NUMBER}.patch
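
A sketch of the same step as a small helper that fills in the issue number and refuses to leave an empty patch behind unnoticed. The function name make_patch and the emptiness check are additions for illustration; the base branch defaults to trunk as in the command above.

```shell
# Write the diff against BASE ($2, default: trunk) to GIRAPH-<ISSUE>.patch
# and print the file name. Fails on a non-numeric issue number or an empty diff.
make_patch() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: make_patch ISSUE_NUMBER [BASE]" >&2; return 1 ;;
    esac
    out="GIRAPH-$1.patch"
    git diff --no-prefix "${2:-trunk}" > "$out" || return 1
    # An empty patch usually means there is nothing to submit.
    [ -s "$out" ] || { echo "$out is empty" >&2; return 1; }
    echo "$out"
}

# Usage, from the root of the Giraph checkout (commented out on purpose):
# make_patch 849
```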




You can skip the yellow parts ... and maybe you need another profile, but I just use hadoop_2 right now.

Good luck!
MK



On Sat, Mar 1, 2014 at 5:57 PM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:
Hi Mirko..

Thanks for your reply. All MapReduce programs are running fine on this system,
and it is a YARN setup.

Please guide me on how to build Giraph with this Hadoop version. Do I also need to install an external ZooKeeper?

Thanks in advance..

Jyoti


On Sat, Mar 1, 2014 at 6:31 PM, Mirko Kämpf <mirko.kaempf@cloudera.com> wrote:
Hello,

if you build Giraph for Hadoop 0.20..., the same jars will not work with Hadoop 2.2.0.
Right now I build with the profile -Phadoop_2 from the current 1.1 branch in the git repo.
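
A rough way to spot such a mismatch before submitting a job is to compare the Hadoop version embedded in the jar name with the one the cluster reports. This is a sketch under assumptions: it relies on the "-for-hadoop-<version>-jar-with-dependencies.jar" naming seen in the command quoted below, and on the first line of `hadoop version` starting with "Hadoop <version>". The function names are hypothetical.

```shell
# Extract the Hadoop version a Giraph jar was built for from its file name,
# e.g. ...-for-hadoop-0.20.203.0-jar-with-dependencies.jar -> 0.20.203.0
jar_hadoop_version() {
    basename "$1" | sed -n 's/.*-for-hadoop-\(.*\)-jar-with-dependencies\.jar$/\1/p'
}

# Extract the version from the output of `hadoop version`,
# e.g. "Hadoop 2.2.0" -> 2.2.0
cluster_hadoop_version() {
    printf '%s\n' "$1" | sed -n 's/^Hadoop \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Usage (commented out; JAR is a placeholder for the path to your jar):
# built_for=$(jar_hadoop_version "$JAR")
# running=$(cluster_hadoop_version "$(hadoop version)")
# [ "$built_for" = "$running" ] || echo "jar built for $built_for, cluster runs $running" >&2
```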

How many nodes (physical servers or VMs) do you run on your 64 core system?
What distro of Hadoop are you working with? And is it an MRv1 or MRv2 (YARN) setup?

Is your MapReduce system working properly? Can you run TeraSort, for example?

Cheers,
Mirko
 


On Sat, Mar 1, 2014 at 4:15 AM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:
Anyone, please reply. Is it a portability problem? Does Giraph have any issues with Hadoop 2.2.0?

Do I need to build Giraph on the new system?

Thanks



On Sat, Mar 1, 2014 at 2:28 PM, Jyoti Yadav <rao.jyoti26yadav@gmail.com> wrote:
Hi Sebastian,
Thanks for the links to big graphs.

Actually, I want to tell you about a problem I am facing.

Initially I was working with Hadoop 0.20.203. I built Giraph there and it was running fine.

Now, to test problems on very big graphs and to compare performance, I moved to a new system with 64 cores, 512 GB of memory, and 3 TB of storage. Instead of building Giraph on the new system, I just copied the Giraph folder from my previous system to the new one. The new system runs Hadoop 2.2.0. I tried to execute the single-source shortest paths example on a sample data set, and it throws the exception below.

I used the following command to execute the job:

hadoop jar /home/abcd2014/giraph/giraph-examples/target/giraph-examples-1.1.0-SNAPSHOT-for-hadoop-0.20.203.0-jar-with-dependencies.jar org.apache.giraph.GiraphRunner -Dgiraph.SplitMasterWorker=true  org.apache.giraph.examples.SimpleShortestPathsComputation -vif org.apache.giraph.io.formats.JsonLongDoubleFloatDoubleVertexInputFormat -vip /user/abcd2014/giraph_input/tiny_graph.txt -vof org.apache.giraph.io.formats.IdWithValueTextOutputFormat -op /user/abcd2014/output2/shortestpaths -w 1



14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge input format specified. Ensure your InputFormat does not require one.
14/03/01 12:44:46 INFO utils.ConfigurationUtils: No edge output format specified. Ensure your OutputFormat does not require one.
14/03/01 12:44:46 INFO Configuration.deprecation: mapreduce.job.counters.limit is deprecated. Instead, use mapreduce.job.counters.max
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.map.memory.mb is deprecated. Instead, use mapreduce.map.memory.mb
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.reduce.memory.mb is deprecated. Instead, use mapreduce.reduce.memory.mb
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative
14/03/01 12:44:46 INFO Configuration.deprecation: mapreduce.user.classpath.first is deprecated. Instead, use mapreduce.job.user.classpath.first
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.map.max.attempts is deprecated. Instead, use mapreduce.map.maxattempts
14/03/01 12:44:46 INFO job.GiraphJob: run: Since checkpointing is disabled (default), do not allow any task retries (setting mapred.map.max.attempts = 0, old value = 4)
14/03/01 12:44:46 INFO Configuration.deprecation: mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
Exception in thread "main" java.lang.IllegalArgumentException: checkLocalJobRunnerConfiguration: When using LocalJobRunner, you cannot run in split master / worker mode since there is only 1 task at a time!
    at org.apache.giraph.job.GiraphJob.checkLocalJobRunnerConfiguration(GiraphJob.java:165)
    at org.apache.giraph.job.GiraphJob.run(GiraphJob.java:233)
    at org.apache.giraph.GiraphRunner.run(GiraphRunner.java:94)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.giraph.GiraphRunner.main(GiraphRunner.java:124)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)



Could you suggest how to fix this? If you need any further details, please let me know.

Thanks & Regards

Jyoti




On Sat, Mar 1, 2014 at 1:35 PM, Sebastian Schelter <ssc@apache.org> wrote:
Hi Jyoti,

You can find a couple of very large graphs in KONECT [1] and on the website of the laboratory for web algorithmics from the University of Milan [2]. You will probably have to convert them to an appropriate format for Giraph.

Best,
Sebastian

[1] http://konect.uni-koblenz.de/
[2] http://law.di.unimi.it/datasets.php


On 03/01/2014 05:22 AM, Jyoti Yadav wrote:
Hi folks..

I got a new system with 64 cores, 512 GB of memory, and 3 TB of
storage. I want to test the performance of Giraph on this system.
Could anyone provide a link to a very large graph so that I can
run the Single Source Shortest Path example? For this algorithm the graph
must be weighted, and the input format to feed it into Giraph is
JsonLongDoubleFloatDouble.

Thanks in advance...
With Regards

Jyoti







--
-- 
Mirko Kämpf

Trainer @ Cloudera

tel: +49 176 20 63 51 99
skype: kamir1604
mirko@cloudera.com