giraph-user mailing list archives

From Gustavo Enrique Salazar Torres <gsala...@ime.usp.br>
Subject Re: Running on Amazon EMR?
Date Mon, 11 Nov 2013 15:05:45 GMT
Hi Rob:

I had to do all the steps you mentioned. In particular, at bootstrap I run
a Bash script stored on S3, with arguments like this:

--core-key-value, giraph.zkList=localhost:2181, --mapred-key-value,
mapreduce.job.counters.limit=1200
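
If you script the job flow creation, the argument list above can be assembled from separate values. This is only a sketch: join_args is a hypothetical helper, and it assumes your tooling expects the arguments as one comma-separated string without spaces.

```shell
#!/bin/sh
# join_args is a hypothetical helper (not part of EMR or Giraph).
# It joins its arguments with commas, assuming the job flow tooling
# wants the bootstrap arguments as one comma-separated string.
join_args() {
    out=""
    sep=""
    for arg in "$@"; do
        out="$out$sep$arg"
        sep=","
    done
    printf '%s\n' "$out"
}

# The two settings from the message above.
join_args \
    --core-key-value "giraph.zkList=localhost:2181" \
    --mapred-key-value "mapreduce.job.counters.limit=1200"
```

Keeping each key=value pair as its own argument makes it easier to add or drop a setting later without editing one long string by hand.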

Then, in the steps configuration, I start by setting up Giraph and ZooKeeper
with two Bash scripts, each run as a separate step:

s3://elasticmapreduce/libs/script-runner/script-runner.jar
s3://mybucket/install_giraph.sh
s3://elasticmapreduce/libs/script-runner/script-runner.jar
s3://mybucket/install_zookeeper.sh

The install_giraph.sh script does this:

hadoop dfs -copyToLocal s3://mybucket/giraph.tar.gz /home/hadoop
tar -xzvf /home/hadoop/giraph.tar.gz -C /home/hadoop
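
A slightly more defensive variant of the unpack step might look like the sketch below. The unpack function is a hypothetical helper, not something from my actual script; it assumes the archive has already been copied to local disk, and it makes the step fail with a message if the copy did not work.

```shell
#!/bin/sh
# unpack ARCHIVE DEST: hypothetical helper that extracts a .tar.gz
# archive into DEST and returns non-zero if the archive is missing
# or the extraction fails.
unpack() {
    archive=$1
    dest=$2
    if [ ! -f "$archive" ]; then
        echo "missing archive: $archive" >&2
        return 1
    fi
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Possible use in install_giraph.sh, after the copy from S3
# (paths are the ones from the message above):
# unpack /home/hadoop/giraph.tar.gz /home/hadoop || exit 1
```

Exiting with a non-zero status from the script should make the EMR step show up as failed, rather than letting a later step run against a missing installation.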

The install_zookeeper.sh script does this:

hadoop dfs -copyToLocal s3://data.clipesebandas/binaries/zookeeper.tar.gz /home/hadoop
tar -xzvf /home/hadoop/zookeeper.tar.gz -C /home/hadoop
/home/hadoop/zookeeper/bin/zkServer.sh start
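
Since the Giraph step expects ZooKeeper at localhost:2181, it may be worth waiting until the server is up before the script returns. A minimal sketch: retry is a hypothetical helper, and the usage line assumes that zkServer.sh status exits non-zero while the server is not yet running, which you should check for your ZooKeeper version.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY COMMAND...: hypothetical helper that runs
# COMMAND until it succeeds, sleeping DELAY seconds between tries.
# Returns 1 if COMMAND still fails after ATTEMPTS tries.
retry() {
    attempts=$1
    delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Possible use at the end of install_zookeeper.sh (assumes the
# status subcommand reports failure through its exit code):
# retry 10 3 /home/hadoop/zookeeper/bin/zkServer.sh status || exit 1
```

The attempt count and delay here are arbitrary; pick values that match how long ZooKeeper takes to start on your instance type.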

And finally I run my Giraph algorithm in another step like this:

/home/hadoop/giraph.jar org.giraph.MyGraphAlgorithm  /user/hadoop/input_graph,
/user/hadoop/built_graph  20 1

Some of these steps, such as the ZooKeeper configuration, may no longer be
needed, since this setup is based on Giraph 0.1.
Hope this helps.

Cheers
Gustavo



On Mon, Nov 11, 2013 at 12:43 PM, Rob Vesse <rvesse@dotnetrdf.org> wrote:

> Hi All
>
> I've been looking around for any documentation about running Giraph on
> Amazon Elastic Map Reduce (EMR) and didn't turn up anything particularly
> useful.
>
> It looks like the only real requirements to run on EMR are to add
> Bootstrap actions to the Job Flow configuration to apply the relevant
> Hadoop configuration settings e.g. increasing max map tasks.  After that it
> looks like I should just need to use a standard Custom JAR launch step to
> launch the Giraph Runner with appropriate arguments for my Giraph program.
>
> Before I start trying to do this and incurring EC2 costs does anyone have
> experience of running Giraph applications on EMR that they are willing to
> share?  Any suggestions, tips, common pitfalls etc I should be aware of?
>
> Cheers,
>
> Rob
>
