From: Jim Shi
To: user@hadoop.apache.org
Reply-To: user@hadoop.apache.org
Subject: Re: Hadoop Learning Environment
Date: Wed, 05 Nov 2014 10:55:18 -0800
Message-id: <1EB2A1E4-8DAC-46E8-B5D1-DEF67212B237@apple.com>
Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm

Hi Jay,
   I followed the steps you described and got the following error.
Any idea?

vagrant up
creating provisioner directive for running tests
Bringing machine 'bigtop1' up with 'virtualbox' provider...
==> bigtop1: Box 'puppetlab-centos-64-nocm' could not be found. Attempting to find and install...
    bigtop1: Box Provider: virtualbox
    bigtop1: Box Version: >= 0
==> bigtop1: Adding box 'puppetlab-centos-64-nocm' (v0) for provider: virtualbox
    bigtop1: Downloading: http://puppet-vagrant-boxes.puppetlabs.com/centos-64-x64-vbox4210-nocm.box
==> bigtop1: Successfully added box 'puppetlab-centos-64-nocm' (v0) for 'virtualbox'!
There are errors in the configuration of this machine. Please fix
the following errors and try again:

vm:
* The 'hostmanager' provisioner could not be found.
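
One possible cause of the error above, assuming the Bigtop Vagrantfile gets its 'hostmanager' provisioner from the separately installed vagrant-hostmanager plugin, is that the plugin is missing. A minimal sketch of a check follows; the function names `has_vagrant_plugin` and `ensure_hostmanager` are made up for illustration, and only `vagrant plugin list` and `vagrant plugin install` are actual Vagrant commands.

```shell
# Hypothetical helper: succeeds if plugin "$1" appears in the output of
# `vagrant plugin list`, which is read from stdin. Each installed plugin
# is listed on its own line as "<name> (<version>...)".
has_vagrant_plugin() {
  grep -q "^$1 "
}

# Hypothetical helper: install vagrant-hostmanager only if it is missing.
ensure_hostmanager() {
  if vagrant plugin list | has_vagrant_plugin vagrant-hostmanager; then
    echo "vagrant-hostmanager is already installed"
  else
    vagrant plugin install vagrant-hostmanager
  fi
}
```

Running `ensure_hostmanager` and then `vagrant up` again would show whether the missing plugin was the problem.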

Thanks
Jim

On Nov 4, 2014, at 6:36 PM, jay vyas <jayunit100.apache@gmail.com> wrote:

> Hi Daemeon: Actually, for most folks who would want to actually use a Hadoop cluster, I would think setting up Bigtop is super easy! If you have issues with it, ping me and I can help you get started.
> Also, we have Docker containers, so you don't even *need* a VM to run a 4 or 5 node Hadoop cluster.
>
> install vagrant
> install VirtualBox
> git clone https://github.com/apache/bigtop
> cd bigtop/bigtop-deploy/vm/vagrant-puppet
> vagrant up
> Then vagrant destroy when you're done.
>
> This to me is easier than manually downloading an appliance, picking memory,
> starting the VirtualBox GUI, loading the appliance, etc. It's also easy to turn the simple single-node Bigtop VM into a multi-node one
> by just modifying the Vagrantfile.
>
>
> On Tue, Nov 4, 2014 at 5:32 PM, daemeon reiydelle <daemeonr@gmail.com> wrote:
> What you want as a sandbox depends on what you are trying to learn.
>
> If you are trying to learn to code in, e.g., Pig Latin, Sqoop, or similar, all of the suggestions (perhaps excluding Bigtop due to its setup complexities) are great. Laptop? Perhaps, but laptops are really kind of infuriatingly slow (because of the hardware: you pay a price for a 30-45 watt average heating bill). A laptop is an OK place to start if it is, e.g., an i5 or i7 with lots of memory. What do you think of the thought that you will pretty quickly graduate to wanting a smallish desktop for your sandbox?
>
> A simple single-node Hadoop instance will let you learn many things. The next level of complexity comes when you are attempting to deal with data whose processing needs to be split up, so you can learn about how to split data in mapping, reduce the splits via reduce jobs, etc. For that, you could get a Windows desktop box or, e.g., Red Hat/CentOS and use virtualization: something like a 4-core i5 with 32 GB of memory, running 3 or, for some things, 4 VMs. You could load, e.g., Hortonworks into each of the VMs and practice setting up a 3- or 4-node cluster. Throw in two or three 1 TB drives off eBay and you can have a lot of learning.
>
> .......
> “The race is not to the swift,
> nor the battle to the strong,
> but to those who can see it coming and jump aside.” - Hunter Thompson
> Daemeon
>
> On Tue, Nov 4, 2014 at 1:24 PM, oscar sumano <osumano@gmail.com> wrote:
> You can try the Pivotal VM as well.
>
> http://pivotalhd.docs.pivotal.io/tutorial/getting-started/pivotalhd-vm.html
>
> On Tue, Nov 4, 2014 at 3:13 PM, Leonid Fedotov <lfedotov@hortonworks.com> wrote:
> Tim,
> download the Sandbox from http://hortonworks.com
> You will have everything needed in a small VM instance which will run on your home desktop.
>
> Thank you!
>
> Sincerely,
> Leonid Fedotov
> Systems Architect - Professional Services
> lfedotov@hortonworks.com
> office: +1 855 846 7866 ext 292
> mobile: +1 650 430 1673
>
> On Tue, Nov 4, 2014 at 11:28 AM, Tim Dunphy <bluethundr@gmail.com> wrote:
> Hey all,
>
> I want to set up an environment where I can teach myself Hadoop. Usually the way I'll handle this is to grab a machine off the Amazon free tier and set up whatever software I want.
>
> However, I realize that Hadoop is a memory-intensive, big data solution. So what I'm wondering is, would a t2.micro instance be sufficient for setting up a cluster of Hadoop nodes with the intention of learning it? To keep things running longer in the free tier I would either set up as many nodes as I want and keep them stopped when I'm not actively using them, or just set up a few nodes with a few different accounts (with a different Gmail address for each one; easy enough to do).
>
> Failing that, what are some other free/cheap solutions for setting up a Hadoop learning environment?
>
> Thanks,
> Tim
>
> --
> GPG me!!
>
> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
>
> --
> jay vyas
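
The setup steps Jay lists in the quoted message can be collected into a small script. This is only a sketch: the repository URL and the `bigtop-deploy/vm/vagrant-puppet` path are taken from the thread as written, while `BIGTOP_DIR`, `missing_tools`, and `bigtop_up` are names invented here for illustration. `VBoxManage` is VirtualBox's command-line tool, used only to check that VirtualBox is installed.

```shell
#!/bin/sh
# Sketch of the quoted steps; BIGTOP_DIR and both function names are
# illustrative assumptions, not part of Bigtop or Vagrant.

# Print each named command that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Clone Bigtop (unless the directory already exists) and start the VM.
bigtop_up() {
  missing=$(missing_tools git vagrant VBoxManage)
  if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Please install first:" $missing >&2
    return 1
  fi
  dir="${BIGTOP_DIR:-bigtop}"
  [ -d "$dir" ] || git clone https://github.com/apache/bigtop "$dir" || return 1
  cd "$dir/bigtop-deploy/vm/vagrant-puppet" || return 1
  vagrant up
}
```

Calling `bigtop_up` runs the steps in order; `vagrant destroy` in the same directory removes the VM when you are done, as in the original instructions.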