hadoop-mapreduce-user mailing list archives

From Jim Shi <hanmao_...@apple.com>
Subject Re: Hadoop Learning Environment
Date Wed, 05 Nov 2014 18:55:18 GMT
Hi Jay,
   I followed the steps you described and got the following error.
Any idea?

  vagrant up
creating provisioner directive for running tests
Bringing machine 'bigtop1' up with 'virtualbox' provider...
==> bigtop1: Box 'puppetlab-centos-64-nocm' could not be found. Attempting to find and install...
    bigtop1: Box Provider: virtualbox
    bigtop1: Box Version: >= 0
==> bigtop1: Adding box 'puppetlab-centos-64-nocm' (v0) for provider: virtualbox
    bigtop1: Downloading: http://puppet-vagrant-boxes.puppetlabs.com/centos-64-x64-vbox4210-nocm.box
==> bigtop1: Successfully added box 'puppetlab-centos-64-nocm' (v0) for 'virtualbox'!
There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The 'hostmanager' provisioner could not be found.
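
Could the Vagrantfile be expecting the vagrant-hostmanager plugin? If so, I'm guessing (purely a guess on my part) that something like this needs to run before vagrant up:

  vagrant plugin install vagrant-hostmanager

and then vagrant up again.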

Thanks
Jim





On Nov 4, 2014, at 6:36 PM, jay vyas <jayunit100.apache@gmail.com> wrote:

> Hi Daemeon: Actually, for most folks who want to actually use a Hadoop cluster, I would think setting up Bigtop is super easy! If you have issues with it, ping me and I can help you get started.
> Also, we have Docker containers, so you don't even *need* a VM to run a 4- or 5-node Hadoop cluster.
> 
> install vagrant
> install VirtualBox
> git clone https://github.com/apache/bigtop
> cd bigtop/bigtop-deploy/vm/vagrant-puppet
> vagrant up
> Then vagrant destroy when you're done.
> 
> This to me is easier than manually downloading an appliance, picking memory, starting the VirtualBox GUI, loading the appliance, etc. It's also easy to turn the simple single-node Bigtop VM into a multi-node one just by modifying the Vagrantfile.
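> For example, something like this might do it (the variable name below is hypothetical; the exact knob depends on the Bigtop version you clone):
> 
> cd bigtop/bigtop-deploy/vm/vagrant-puppet
> # open the Vagrantfile and raise the node count, e.g. set num_instances = 3
> vi Vagrantfile
> vagrant up        # brings up every machine defined in the Vagrantfile
> vagrant status    # confirm all the nodes are running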
> 
> 
> On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle <daemeonr@gmail.com> wrote:
> What you want as a sandbox depends on what you are trying to learn. 
> 
> If you are trying to learn to code in e.g. Pig Latin, Sqoop, or similar, all of the suggestions (perhaps excluding Bigtop due to its setup complexity) are great. A laptop? Perhaps, but laptops are really kind of infuriatingly slow (because of the hardware: you pay a price for a 30-45 watt power envelope). A laptop is an OK place to start if it is e.g. an i5 or i7 with lots of memory. What do you think of the thought that you will pretty quickly graduate to wanting a smallish desktop for your sandbox?
> 
> A simple, single-node Hadoop instance will let you learn many things. The next level of complexity comes when you are dealing with data whose processing needs to be split up, so you can learn how input data is split across map tasks and how the splits are reduced via reduce jobs, etc. For that, you could get a Windows desktop box, or e.g. Red Hat/CentOS, and use virtualization. Something like a 4-core i5 with 32 GB of memory, running 3 (or for some things 4) VMs. You could load e.g. Hortonworks into each of the VMs and practice setting up a 3- or 4-way cluster. Throw in 2-3 1 TB drives off of eBay and you can get a lot of learning done.
> 
> 
> 
> 
> 
> .......
> “The race is not to the swift,
> nor the battle to the strong,
> but to those who can see it coming and jump aside.” - Hunter Thompson
> Daemeon
> 
> On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano <osumano@gmail.com> wrote:
> You can try the Pivotal HD VM as well.
> 
> http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
> 
> On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov <lfedotov@hortonworks.com> wrote:
> Tim,
> Download the Sandbox from http://hortonworks.com
> You will have everything needed in a small VM instance which will run on your home desktop.
> 
> 
> Thank you!
> 
> Sincerely,
> Leonid Fedotov
> Systems Architect - Professional Services
> lfedotov@hortonworks.com
> office: +1 855 846 7866 ext 292
> mobile: +1 650 430 1673
> 
> 
> On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy <bluethundr@gmail.com> wrote:
> Hey all,
> 
> I want to set up an environment where I can teach myself Hadoop. Usually the way I'll handle this is to grab a machine off the Amazon free tier and set up whatever software I want.
> 
> However, I realize that Hadoop is a memory-intensive, big-data solution. So what I'm wondering is, would a t2.micro instance be sufficient for setting up a cluster of Hadoop nodes with the intention of learning it? To keep things running longer in the free tier I would either set up however many nodes I want and keep them stopped when I'm not actively using them, or just set up a few nodes under a few different accounts (with a different Gmail address for each one... easy enough to do).
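> I assume stopping and starting them between sessions would just be a couple of AWS CLI calls, something like this (the instance IDs below are placeholders):
> 
> aws ec2 stop-instances --instance-ids i-0123456789abcdef0 i-0123456789abcdef1
> aws ec2 start-instances --instance-ids i-0123456789abcdef0 i-0123456789abcdef1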
> 
> Failing that, what are some other free/cheap solutions for setting up a Hadoop learning environment?
> 
> Thanks,
> Tim
> 
> -- 
> GPG me!!
> 
> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
> 
> 
> 
> 
> 
> 
> 
> 
> -- 
> jay vyas

