hadoop-mapreduce-user mailing list archives

From daemeon reiydelle <daeme...@gmail.com>
Subject Re: Hadoop Learning Environment
Date Tue, 04 Nov 2014 22:32:20 GMT
What you want as a sandbox depends on what you are trying to learn.

If you are trying to learn to code in e.g. Pig Latin, Sqoop, or similar, all
of the suggestions (perhaps excluding BigTop due to its setup complexity)
are great. A laptop? Perhaps, but laptops are infuriatingly slow because of
the hardware: you pay a performance price for keeping the average heat
output down to 30-45 watts. A laptop is an OK place to start if it is e.g.
an i5 or i7 with lots of memory. My guess is that you will fairly quickly
graduate to wanting a smallish desktop for your sandbox; what do you think?

A simple, single-node Hadoop instance will let you learn many things. The
next level of complexity comes when you are dealing with data whose
processing needs to be split up, so that you can learn how the input is
split across map tasks and how reduce tasks then combine the mapped output.
For that, you could get a desktop box running Windows or e.g. RedHat/CentOS
and use virtualization. Something like a 4-core i5 with 32 GB of memory,
running 3 (or, for some things, 4) VMs. You could load e.g. Hortonworks into
each of the VMs and practice setting up a 3- or 4-node cluster. Throw in two
or three 1 TB drives off of eBay and you can get a lot of learning done.
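
To make the split/map/reduce idea above concrete, here is a minimal sketch
in plain Python. It only simulates the phases in a single process; it is not
Hadoop's API, and every function name in it (split_input, map_words, shuffle,
reduce_counts, word_count) is made up for illustration.

```python
from collections import defaultdict


def split_input(lines, num_splits):
    """Divide the input lines into num_splits chunks, one per simulated mapper."""
    return [lines[i::num_splits] for i in range(num_splits)]


def map_words(chunk):
    """Map phase: emit a (word, 1) pair for every word in the chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Shuffle phase: group the emitted values by key across all mappers."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_counts(grouped):
    """Reduce phase: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines, num_splits=3):
    """Run split -> map -> shuffle -> reduce over the given lines."""
    chunks = split_input(lines, num_splits)
    mapped = [map_words(chunk) for chunk in chunks]
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog", "the fox"]
    print(word_count(sample))
```

The totals do not depend on num_splits, which is the point: on a real
cluster each chunk would go to a different node (or VM), and the framework
would handle the shuffle between them.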

*“The race is not to the swift, nor the battle to the strong, but to
those who can see it coming and jump aside.” - Hunter Thompson*

*Daemeon*

On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano <osumano@gmail.com> wrote:

> you can try the pivotal vm as well.
>
> http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
>
> On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov <lfedotov@hortonworks.com>
> wrote:
>
>> Tim,
>> download Sandbox from http://hortonworks.com
>> You will have everything needed in a small VM instance which will run on
>> your home desktop.
>>
>>
>> *Thank you!*
>>
>>
>> *Sincerely,*
>>
>> *Leonid Fedotov*
>>
>> Systems Architect - Professional Services
>>
>> lfedotov@hortonworks.com
>>
>> office: +1 855 846 7866 ext 292
>>
>> mobile: +1 650 430 1673
>>
>> On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy <bluethundr@gmail.com> wrote:
>>
>>> Hey all,
>>>
>>>  I want to setup an environment where I can teach myself hadoop. Usually
>>> the way I'll handle this is to grab a machine off the Amazon free tier and
>>> setup whatever software I want.
>>>
>>> However I realize that Hadoop is a memory intensive, big data solution.
>>> So what I'm wondering is, would a t2.micro instance be sufficient for
>>> setting up a cluster of hadoop nodes with the intention of learning it? To
>>> keep things running longer in the free tier I would either setup however
>>> many nodes as I want and keep them stopped when I'm not actively using
>>> them. Or just setup a few nodes with a few different accounts (with a
>>> different gmail address for each one.. easy enough to do).
>>>
>>> Failing that, what are some other free/cheap solutions for setting up a
>>> hadoop learning environment?
>>>
>>> Thanks,
>>> Tim
>>>
>>> --
>>> GPG me!!
>>>
>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>
>>>
>>
>
>
>
