hadoop-user mailing list archives

From José Luis Larroque <larroques...@gmail.com>
Subject Using 2.4 hadoop version - Thinking in a future deployment in AWS (EC2)
Date Sat, 28 Mar 2015 22:19:02 GMT
Hi everyone, I'm new to this user list, and I need help!

I started working with Hadoop recently, and I'm trying to use it with
Giraph. To build Giraph, I chose the 2.4 release of Hadoop (because,
according to this link
<http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-hadoop-version.html>,
it's the latest version supported in AWS; please tell me if this is not OK
for some reason). I'm still running local tests, and I'm not even close to
a full deployment in AWS, but I'm doing my work and testing with that goal
in mind.

But when I was building Giraph with Maven (I'm avoiding the use of YARN),
I hit the problem described in this bug
<https://issues.apache.org/jira/browse/HADOOP-10547>. I don't know which
option is best from here, and I hope someone can help me choose:
- Choose the 2.2 version and try to build Giraph with it (it's the most
recent version before 2.4 that is available).
- Download the Hadoop 2.4 source, fix HADOOP-10547 (because it appears
that the 2.4 version of Hadoop for AWS doesn't have this fix
<http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EnvironmentConfig_AMIHadoopPatches.html>),
build it, and then try to rebuild Giraph. I would only be building Hadoop
myself to fix that bug; I don't plan to keep modifying it after that. The
downside of this option is that I'm not sure whether it's possible to do
the same in AWS (maybe with bootstrap actions?).

Bye!
Jose
