Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hadoop.apache.org
Delivered-To: mailing list user@hadoop.apache.org
MIME-Version: 1.0
From: Gavin Yue
Date: Tue, 4 Nov 2014 19:43:27 -0800
Subject: Re: Hadoop Learning Environment
To: user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=001a113462be22b49b05071463ab

--001a113462be22b49b05071463ab
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Try docker!

http://ferry.opencore.io/en/latest/examples/hadoop.html

On Tue, Nov 4, 2014 at 6:36 PM, jay vyas wrote:

> Hi daemeon: Actually, for most folks who would want to actually use a
> hadoop cluster, I would think setting up bigtop is super easy! If you
> have issues with it, ping me and I can help you get started.
> Also, we have docker containers, so you don't even *need* a VM to run a
> 4 or 5 node hadoop cluster.
>
> install vagrant
> install VirtualBox
> git clone https://github.com/apache/bigtop
> cd bigtop/bigtop-deploy/vm/vagrant-puppet
> vagrant up
> Then vagrant destroy when you're done.
>
> This to me is easier than manually downloading an appliance, picking
> memory, starting the VirtualBox GUI, loading the appliance, etc., and
> it's also easy to turn the simple single-node bigtop VM into a
> multinode one, just by modifying the Vagrantfile.
>
>
> On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle wrote:
>
>> What you want as a sandbox depends on what you are trying to learn.
>>
>> If you are trying to learn to code in e.g. PigLatin, Sqoop, or similar,
>> all of the suggestions (perhaps excluding BigTop due to its setup
>> complexities) are great. Laptop? Perhaps, but laptops are really kind of
>> infuriatingly slow (because of the hardware: you pay a price for a
>> 30-45 watt average heating bill). A laptop is an OK place to start if it
>> is e.g. an i5 or i7 with lots of memory. What do you think of the thought
>> that you will pretty quickly graduate to wanting a smallish desktop for
>> your sandbox?
>>
>> A simple, single-node Hadoop instance will let you learn many things.
>> The next level of complexity comes when you are attempting to deal with
>> data whose processing needs to be split up, so you can learn how to
>> split data in mapping, reduce the splits via reduce jobs, etc. For that,
>> you could get a Windows desktop box or e.g. RedHat/CentOS and use
>> virtualization. Something like a 4-core i5 with 32 GB of memory, running
>> 3 or, for some things, 4 VMs. You could load e.g. Hortonworks into each
>> of the VMs and practice setting up a 3- or 4-node cluster. Throw in 2-3
>> 1 TB drives off of eBay and you can have a lot of learning.
>>
>> *.......“The race is not to the swift,
>> nor the battle to the strong,
>> but to those who can see it coming and jump aside.” - Hunter Thompson
>> Daemeon*
>>
>> On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano wrote:
>>
>>> you can try the pivotal vm as well.
>>>
>>> http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
>>>
>>> On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov wrote:
>>>
>>>> Tim,
>>>> download Sandbox from http://hortonworks.com
>>>> You will have everything needed in a small VM instance which will run
>>>> on your home desktop.
>>>>
>>>> *Thank you!*
>>>>
>>>> *Sincerely,*
>>>>
>>>> *Leonid Fedotov*
>>>>
>>>> Systems Architect - Professional Services
>>>>
>>>> lfedotov@hortonworks.com
>>>>
>>>> office: +1 855 846 7866 ext 292
>>>>
>>>> mobile: +1 650 430 1673
>>>>
>>>> On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy wrote:
>>>>
>>>>> Hey all,
>>>>>
>>>>> I want to set up an environment where I can teach myself hadoop.
>>>>> Usually the way I'll handle this is to grab a machine off the Amazon
>>>>> free tier and set up whatever software I want.
>>>>>
>>>>> However, I realize that Hadoop is a memory-intensive, big data
>>>>> solution. So what I'm wondering is, would a t2.micro instance be
>>>>> sufficient for setting up a cluster of hadoop nodes with the intention
>>>>> of learning it? To keep things running longer in the free tier I would
>>>>> either set up as many nodes as I want and keep them stopped when I'm
>>>>> not actively using them, or just set up a few nodes with a few
>>>>> different accounts (with a different gmail address for each one...
>>>>> easy enough to do).
>>>>>
>>>>> Failing that, what are some other free/cheap solutions for setting up
>>>>> a hadoop learning environment?
>>>>>
>>>>> Thanks,
>>>>> Tim
>>>>>
>>>>> --
>>>>> GPG me!!
>>>>>
>>>>> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>>>>
>>>> CONFIDENTIALITY NOTICE
>>>> NOTICE: This message is intended for the use of the individual or
>>>> entity to which it is addressed and may contain information that is
>>>> confidential, privileged and exempt from disclosure under applicable
>>>> law. If the reader of this message is not the intended recipient, you
>>>> are hereby notified that any printing, copying, dissemination,
>>>> distribution, disclosure or forwarding of this communication is
>>>> strictly prohibited. If you have received this communication in error,
>>>> please contact the sender immediately and delete it from your system.
>>>> Thank You.
>
> --
> jay vyas

--001a113462be22b49b05071463ab
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Tue, Nov 4, 2014 at 6:36 PM, jay vyas <jayunit100.apache@gmail.com> wrote:
Hi daemeon: Actually, for most folks who would want to actually use a hadoop cluster, I would think setting up bigtop is super easy! If you have issues with it, ping me and I can help you get started.
Also, we have docker containers, so you don't even *need* a VM to run a 4 or 5 node hadoop cluster.

install vagrant
install VirtualBox
git clone https://github.com/apache/bigtop
cd bigtop/bigtop-deploy/vm/vagrant-puppet
vagrant up
Then vagrant destroy when you're done.

This to me is easier than manually downloading an appliance, picking memory, starting the VirtualBox GUI, loading the appliance, etc., and it's also easy to turn the simple single-node bigtop VM into a multinode one, just by modifying the Vagrantfile.
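
The steps above can be sketched as a small shell script. This is only a minimal sketch under stated assumptions, not an official Bigtop script: the `run` helper, the `DRY_RUN` variable, and the two function names are made up for illustration, and the directory is the one quoted in the message, which may differ in newer checkouts of the repository. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the Bigtop VM workflow quoted in the thread.
# DRY_RUN (hypothetical variable) defaults to 1: commands are printed, not run.
# Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# run (hypothetical helper): print the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Clone Bigtop and bring up the VM, following the steps from the message.
bigtop_vm_up() {
    if [ "$DRY_RUN" != "1" ]; then
        # Only check prerequisites when the commands will really run.
        for tool in git vagrant; do
            command -v "$tool" >/dev/null 2>&1 || {
                echo "missing prerequisite: $tool" >&2
                return 1
            }
        done
    fi
    run git clone https://github.com/apache/bigtop || return 1
    # Path as quoted in the thread; the repository layout may have changed.
    run cd bigtop/bigtop-deploy/vm/vagrant-puppet || return 1
    run vagrant up
}

# Tear the VM down again when finished.
bigtop_vm_down() {
    run vagrant destroy
}

bigtop_vm_up
```

Running it with `DRY_RUN=0` would execute the same commands for real, and `bigtop_vm_down` would then be called from the same directory once the VM is no longer needed.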


On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle <daemeonr@gmail.com> wrote:
What you want as a sandbox depends on what you are trying to learn.

If you are trying to learn to code in e.g. PigLatin, Sqoop, or similar, all of the suggestions (perhaps excluding BigTop due to its setup complexities) are great. Laptop? Perhaps, but laptops are really kind of infuriatingly slow (because of the hardware: you pay a price for a 30-45 watt average heating bill). A laptop is an OK place to start if it is e.g. an i5 or i7 with lots of memory. What do you think of the thought that you will pretty quickly graduate to wanting a smallish desktop for your sandbox?

A simple, single-node Hadoop instance will let you learn many things. The next level of complexity comes when you are attempting to deal with data whose processing needs to be split up, so you can learn how to split data in mapping, reduce the splits via reduce jobs, etc. For that, you could get a Windows desktop box or e.g. RedHat/CentOS and use virtualization. Something like a 4-core i5 with 32 GB of memory, running 3 or, for some things, 4 VMs. You could load e.g. Hortonworks into each of the VMs and practice setting up a 3- or 4-node cluster. Throw in 2-3 1 TB drives off of eBay and you can have a lot of learning.
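
The split, map, and reduce flow described above can be tried on any machine before a cluster exists. The following is a rough sketch, assuming a plain word count as the example job; it uses ordinary Unix tools rather than Hadoop itself, and the function names `map_words`, `shuffle`, and `reduce_counts` are invented for illustration. Two separate `printf` calls stand in for two input splits, each handled by its own mapper.

```shell
#!/bin/sh
# Word count as three explicit stages: map -> shuffle -> reduce.
# Plain Unix tools only; this imitates the data flow, not Hadoop's API.

# map: emit one "word<TAB>1" record for every word in the input split.
map_words() {
    tr -cs '[:alpha:]' '\n' | tr '[:upper:]' '[:lower:]' |
        awk 'NF { print $0 "\t1" }'
}

# shuffle: sort records so identical keys end up next to each other.
shuffle() {
    LC_ALL=C sort
}

# reduce: sum the counts for each key and print "word<TAB>total".
reduce_counts() {
    awk -F '\t' '{ sum[$1] += $2 } END { for (w in sum) print w "\t" sum[w] }' |
        LC_ALL=C sort
}

# Two input splits, mapped independently, then merged and reduced.
{
    printf 'the swift the strong\n' | map_words
    printf 'the race\n' | map_words
} | shuffle | reduce_counts
```

Each mapper sees only its own split, and only the reduce stage sees all records for a given word, which is the property that lets a real cluster spread the map work across nodes.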





.......“The race is not to the swift,
nor the battle to the strong,
but to those who can see it coming and jump aside.” - Hunter Thompson
Daemeon

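On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano <osumano@gmail.com> wrote:
you can try the pivotal vm as well.

http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
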
On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov <lfedotov@hortonworks.com> wrote:
Tim,
download Sandbox from http://hortonworks.com
You will have everything needed in a small VM instance which will run on your home desktop.


Thank you!


Sincerely,

Leonid Fedotov

Systems Architect - Professional Services

lfedotov@hortonworks.com

office: +1 855 846 7866 ext 292

mobile: +1 650 430 1673


On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy <bluethundr@gmail.com> wrote:
Hey all,

I want to set up an environment where I can teach myself hadoop. Usually the way I'll handle this is to grab a machine off the Amazon free tier and set up whatever software I want.

However, I realize that Hadoop is a memory-intensive, big data solution. So what I'm wondering is, would a t2.micro instance be sufficient for setting up a cluster of hadoop nodes with the intention of learning it? To keep things running longer in the free tier I would either set up as many nodes as I want and keep them stopped when I'm not actively using them, or just set up a few nodes with a few different accounts (with a different gmail address for each one... easy enough to do).

Failing that, what are some other free/cheap solutions for setting up a hadoop learning environment?

Thanks,
Tim

--
GPG me!!

gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B



CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.





--
jay vyas

--001a113462be22b49b05071463ab--