hadoop-mapreduce-user mailing list archives

From José Luis Larroque <larroques...@gmail.com>
Subject Re: Use of hadoop in AWS - Build it from scratch on a EC2 instance / MapR hadoop distribution / Amazon hadoop distribution
Date Mon, 19 Oct 2015 00:10:30 GMT
Thanks for your answer, Anders.

- The amount of data I'm going to process is about the size of Wikipedia (I
will use a dump).
- I already have the basics of Hadoop (I hope): I have a local multi-node
cluster set up and I have already run some algorithms on it.
- Because the amount of data is significant, I believe I should use
several nodes.

Another factor to consider may be that I'm running Giraph on
top of the selected Hadoop distribution/EC2.

Bye!
Jose

2015-10-18 18:53 GMT-03:00 Anders Nielsen <anders.shinde.nielsen@gmail.com>:

> Dear Jose,
>
> It will help people answer your question if you specify your goals:
>
> - If you are doing it to learn how to USE a running Hadoop, then go for one
> of the prebuilt distributions (Amazon or MapR).
> - If you are doing it to learn more about setting up and administering
> Hadoop, then you are better off setting everything up from scratch on EC2.
> - Do you need to run on many nodes, or just one node to test some MapReduce
> scripts on a small data set?
>
> Regards,
>
> Anders
>
>
>
>
> On Sun, Oct 18, 2015 at 10:03 PM, José Luis Larroque <
> larroquester@gmail.com> wrote:
>
>> Hi all!
>>
>> I have started using Hadoop on AWS, and a big question has come up.
>>
>> I'm using a MapR distribution for Hadoop 2.4.0 on AWS. I have already tried
>> some trivial examples, and before moving forward I have one question.
>>
>> What is the best option for using Hadoop on AWS?
>> - Build it from scratch on an EC2 instance
>> - Use MapR distribution of Hadoop
>> - Use Amazon distribution of Hadoop
>>
>> Sorry if my question is too broad.
>>
>> Bye!
>> Jose
>>
>>
>>
>>
>>
>
