hadoop-hdfs-user mailing list archives

From Jonathan Aquilina <jaquil...@eagleeyet.net>
Subject Re: t2.micro on AWS; Is it enough for setting up Hadoop cluster ?
Date Sat, 07 Mar 2015 09:29:04 GMT

When I was testing, I used the default setup: 1 master node, 2 core nodes,
and no task nodes. I would spin up the cluster and then terminate it. The
term for that is a transient cluster.
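
A minimal sketch of how such a transient cluster could be described for
boto3's `run_job_flow` call. The helper name `transient_cluster_request`,
the cluster name, and the m3.large default for the test setup are
assumptions for illustration. A real request also needs a release or AMI
version, IAM roles, and at least one step, all omitted here.

```python
def transient_cluster_request(name, instance_type="m3.large",
                              core_nodes=2, task_nodes=0):
    """Build a partial kwargs dict for boto3's emr.run_job_flow().

    Hypothetical helper: it only fills in the instance layout and the
    flag that makes the cluster transient.
    """
    groups = [
        {"Name": "master", "InstanceRole": "MASTER",
         "InstanceType": instance_type, "InstanceCount": 1},
        {"Name": "core", "InstanceRole": "CORE",
         "InstanceType": instance_type, "InstanceCount": core_nodes},
    ]
    if task_nodes > 0:
        groups.append({"Name": "task", "InstanceRole": "TASK",
                       "InstanceType": instance_type,
                       "InstanceCount": task_nodes})
    return {
        "Name": name,
        "Instances": {
            "InstanceGroups": groups,
            # False means the cluster terminates once its steps finish,
            # which is what makes it transient.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
    }


# Default test layout from the message: 1 master, 2 core, no task nodes.
request = transient_cluster_request("emr-test-run")
```

The resulting dict would be merged with the omitted fields and passed as
keyword arguments to `run_job_flow` on a boto3 EMR client.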

When the big data needed to be crunched, I changed the setup a bit. An
important note: EMR has a limit of 20 nodes, whether core or task, but a
request can be submitted to have that limit lifted.

When actually live, I had 1 master node, 3 core nodes (which have HDFS
storage), and 10 task nodes. All instances were of size m3.large. I ran
another batch of data for 2013 through EMR with this setup in 31 minutes.
That is just the time to run the data; it does not include cluster
spawn-up time.
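
A rough sketch of how the cost of a run like this could be estimated. The
hourly rate below is a placeholder, not a real AWS price, and the function
name is hypothetical. It assumes usage is rounded up to whole hours per
node; pass `bill_whole_hours=False` if billing is finer-grained.

```python
import math


def estimate_cluster_cost(node_count, runtime_minutes, hourly_rate_usd,
                          bill_whole_hours=True):
    """Estimate the cost of one cluster run in USD.

    hourly_rate_usd is the combined per-node hourly charge and must be
    supplied by the caller from current pricing.
    """
    hours = runtime_minutes / 60
    if bill_whole_hours:
        # Assumption: partial hours are charged as full hours.
        hours = math.ceil(hours)
    return node_count * hours * hourly_rate_usd


# Live layout from the message: 1 master + 3 nodes with HDFS + 10 task nodes.
nodes = 1 + 3 + 10
PLACEHOLDER_RATE = 0.20  # hypothetical USD per node-hour, NOT a real price
cost = estimate_cluster_cost(nodes, 31, PLACEHOLDER_RATE)
print(f"{nodes} nodes for 31 min: about {cost:.2f} USD at the placeholder rate")
```

Because the whole cluster is billed per node, short transient runs stay
cheap, while a cluster left running idle keeps accruing the same per-node
charge.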

One thing to note: you do not need to use HDFS storage, as that can and
will drive up the cost quickly, and there is a chance of data corruption
or even data loss if a core node crashes. I have been using Amazon S3 and
pulling the data from there. The biggest advantage is that you can spawn
multiple clusters and have them all process the same shared data. Using
HDFS has its perks too, but costs can increase drastically as well.

Jonathan Aquilina
Founder Eagle Eye T

On 2015-03-07 09:54, tesmai4@gmail.com wrote: 

> Dear Jonathan,
> Would you please describe the process of running EMR-based Hadoop for $15.00? I tried,
> and my costs were rocketing, like $60 for one hour.
> Regards
> On 05/03/2015 23:57, Jonathan Aquilina wrote: 
> Krish, EMR won't cost you much. With all the testing and data we ran through the test systems,
> as well as the large amount of data when everything was ready, we paid about 15.00 USD. I honestly
> do not think that the specs there would be enough, as Java can be pretty RAM hungry.
> ---
> Regards,
> Jonathan Aquilina
> Founder Eagle Eye T
> On 2015-03-06 00:41, Krish Donald wrote: 
> Hi, 
> I am new to AWS and would like to set up a Hadoop cluster using Cloudera Manager for 6-7
> t2.micro on AWS; Is it enough for setting up Hadoop cluster ? 
> I would like to use free service as of now. 
> Please advise. 
> Thanks 
> Krish