From: "GOEKE, MATTHEW (AG/1000)"
To: "common-user@hadoop.apache.org"
Subject: RE: hadoop cluster on VM's
Date: Mon, 15 Aug 2011 20:15:38 +0000

I was referring to multiple VM's on a single machine (that you have in house) for my previous comment and not EC2. FWIW, I would rather see a single heavy data node than partition off a single box into multiple machines unless you are trying to do more on that server than just Hadoop. Obviously every person / company has their own constraints, but if this box is solely for Hadoop then don't partition it; otherwise you will incur a decent loss in possible map/reduce slots.

Matt

-----Original Message-----
From: Liam Friel [mailto:liam.friel@gmail.com]
Sent: Monday, August 15, 2011 3:04 PM
To: common-user@hadoop.apache.org
Subject: Re: hadoop cluster on VM's

On Mon, Aug 15, 2011 at 7:31 PM, GOEKE, MATTHEW (AG/1000) <matthew.goeke@monsanto.com> wrote:

> Is this just for testing purposes or are you planning on going into
> production with this? If it is the latter then I would STRONGLY advise to
> not give that a second thought due to how the framework handles I/O.
> However if you are just trying to test out distributed daemon setup and get some ops
> documentation then have at it :)
>
> Matt
>
> -----Original Message-----
> From: Travis Camechis [mailto:camechis@gmail.com]
> Sent: Monday, August 15, 2011 12:45 PM
> To: common-user@hadoop.apache.org
> Subject: hadoop cluster on VM's
>
> Is it recommended to install a hadoop cluster on a set of VM's that are all
> connected to a SAN?

Could you expand on that?
Do you mean multiple VMs on a single server are a no-no?
Or do you mean running Hadoop on something like Amazon EC2 for production is also a no-no?
With some pointers to background if the latter please ...

Just for my education. I have run some (test I guess you could call them) Hadoop clusters on EC2 and it was working OK. However I didn't have the equivalent pile of physical hardware lying around to do a comparison ... which I guess is why EC2 is so attractive.

Ta
Liam