hadoop-hdfs-user mailing list archives

From jay vyas <jayunit100.apa...@gmail.com>
Subject Re: Hadoop Learning Environment
Date Wed, 05 Nov 2014 02:36:32 GMT
Hi Daemeon: Actually, for most folks who want to actually use a
Hadoop cluster, I would think setting up Bigtop is super easy! If you
have issues with it, ping me and I can help you get started.
Also, we have Docker containers - so you don't even *need* a VM to run a 4-
or 5-node Hadoop cluster.
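The Docker route looks roughly like this - note the directory and script
names below are assumptions about the Bigtop layout, not guaranteed paths;
check the bigtop repo for the actual Docker provisioner entry point:

```shell
# Hedged sketch of a Docker-based Bigtop cluster (no VM needed).
# The docker-puppet path and docker-hadoop.sh script are assumptions --
# consult the apache/bigtop repo for the real provisioner layout.
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/docker-puppet   # path is an assumption
./docker-hadoop.sh --create 4              # spin up a 4-node cluster
./docker-hadoop.sh --destroy               # tear it all down when finished
```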

install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up
Then vagrant destroy when you're done.

This, to me, is easier than manually downloading an appliance, picking memory,
starting the VirtualBox GUI, loading the appliance, etc. It's also
easy to turn the simple single-node Bigtop VM into a multi-node one,
by just modifying the Vagrantfile.
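For example, a sketch of the multi-node change - the config file name and
the num_instances key are assumptions about how the Bigtop Vagrant recipe is
parameterized; check the README in that directory for the actual knob:

```shell
# Hedged sketch: grow the single-node Bigtop VM into a 3-node cluster.
# vagrantconfig.yaml and the num_instances key are assumptions about
# the bigtop-deploy/vm/vagrant-puppet layout -- verify against the repo.
cd bigtop/bigtop-deploy/vm/vagrant-puppet
sed -i 's/num_instances: 1/num_instances: 3/' vagrantconfig.yaml
vagrant up          # provisions three VMs and puppet-applies Hadoop to each
vagrant destroy -f  # tear the whole cluster down when finished
```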


On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle <daemeonr@gmail.com>
wrote:

> What you want as a sandbox depends on what you are trying to learn.
>
> If you are trying to learn to code in, e.g., Pig Latin, Sqoop, or similar, all
> of the suggestions (perhaps excluding Bigtop due to its setup complexities)
> are great. Laptop? Perhaps, but laptops are really kind of infuriatingly
> slow (because of the hardware - you pay a price for a 30-45 watt average
> heating bill). A laptop is an OK place to start if it is, e.g., an i5 or i7
> with lots of memory. What do you think of the thought that you will pretty
> quickly graduate to wanting a smallish desktop for your sandbox?
>
> A simple, single-node Hadoop instance will let you learn many things. The
> next level of complexity comes when you are attempting to deal with data
> whose processing needs to be split up, so you can learn how to split
> data in the map phase, reduce the splits via reduce jobs, etc. For that, you
> could get a Windows desktop box, or e.g. RedHat/CentOS, and use
> virtualization. Something like a 4-core i5 with 32 GB of memory, running 3
> (or, for some things, 4) VMs. You could load e.g. Hortonworks into each of
> the VMs and practice setting up a 3/4-way cluster. Throw in 2-3 1 TB drives
> off of eBay and you can have a lot of learning.
>
> *“The race is not to the swift, nor the battle to the strong, but to
> those who can see it coming and jump aside.” - Hunter Thompson*
> *Daemeon*
> On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano <osumano@gmail.com> wrote:
>
>> You can try the Pivotal VM as well.
>>
>>
>> http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
>>
>> On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov <lfedotov@hortonworks.com>
>> wrote:
>>
>>> Tim,
>>> download the Sandbox from http://hortonworks.com
>>> You will have everything needed in a small VM instance which will run on
>>> your home desktop.
>>>
>>>
>>> *Thank you!*
>>>
>>>
>>> *Sincerely,*
>>>
>>> *Leonid Fedotov*
>>>
>>> Systems Architect - Professional Services
>>>
>>> lfedotov@hortonworks.com
>>>
>>> office: +1 855 846 7866 ext 292
>>>
>>> mobile: +1 650 430 1673
>>>
>>> On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy <bluethundr@gmail.com>
>>> wrote:
>>>
>>>> Hey all,
>>>>
>>>>  I want to set up an environment where I can teach myself Hadoop.
>>>> Usually the way I'll handle this is to grab a machine off the Amazon free
>>>> tier and set up whatever software I want.
>>>>
>>>> However, I realize that Hadoop is a memory-intensive big data solution.
>>>> So what I'm wondering is, would a t2.micro instance be sufficient for
>>>> setting up a cluster of Hadoop nodes with the intention of learning it? To
>>>> keep things running longer in the free tier, I would either set up however
>>>> many nodes as I want and keep them stopped when I'm not actively using
>>>> them, or just set up a few nodes with a few different accounts (with a
>>>> different gmail address for each one... easy enough to do).
>>>>
>>>> Failing that, what are some other free/cheap solutions for setting up a
>>>> hadoop learning environment?
>>>>
>>>> Thanks,
>>>> Tim
>>>>
>>>> --
>>>> GPG me!!
>>>>
>>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>>
>>>>
>>>
>>
>>
>>
>


-- 
jay vyas
