Date: Tue, 29 Sep 2009 10:19:49 -0700
Subject: Which instance type on Amazon EC2?
From: Kevin Peterson
To: core-user@hadoop.apache.org

Has anyone done any extensive testing of which instance types on Amazon EC2 give you the most bang for the buck? Given the usual Hadoop recommendation of beefy machines, I would expect the best performance from the extra-large, but our testing showed otherwise.

We did some rough testing when we were just getting started, on a cluster of about 10 nodes, and we found that the extra-large instance doesn't come close to twice the actual performance of the large instance (priced at $0.80 and $0.40, respectively). My rationalization is that some of the resources are shared: the extra-large instance corresponds to the actual hardware, while the large instance sometimes gets to use more than 50% of the IO and network bandwidth when the other tenant isn't doing much.

I'm revisiting our config because we're deploying HBase soon. I'm not sure whether I would be better off moving to extra-large instances so that I can co-locate the tasktrackers and the region servers on the same nodes, or whether I should stick with large instances and put HBase on separate servers.

Mostly I'm wondering if my results were a fluke.
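For anyone who wants to repeat the comparison, here is a rough sketch of the price/performance arithmetic behind the question. Only the two prices come from the message; they are assumed to be per instance-hour. The function names, the node count passed in, and the job runtimes are hypothetical placeholders, not measured results, so substitute your own timings.

```python
# Prices quoted in the message (USD, assumed to be per instance-hour).
PRICE_LARGE = 0.40
PRICE_XLARGE = 0.80


def cost_per_job(price_per_hour, nodes, job_hours):
    """Dollars spent to run one job on a cluster of `nodes` instances."""
    return price_per_hour * nodes * job_hours


def breakeven_speedup(price_big, price_small):
    """Speedup the pricier instance needs to match the cheaper one per dollar."""
    return price_big / price_small


# Hypothetical runtimes for the same job on 10 nodes; NOT measured results.
large = cost_per_job(PRICE_LARGE, nodes=10, job_hours=1.0)
xlarge = cost_per_job(PRICE_XLARGE, nodes=10, job_hours=0.7)

print(f"break-even speedup: {breakeven_speedup(PRICE_XLARGE, PRICE_LARGE):.1f}x")
print(f"large:  ${large:.2f} per job")
print(f"xlarge: ${xlarge:.2f} per job")
```

With these placeholder runtimes the extra-large cluster finishes sooner but still costs more per job, which is the pattern described above: at double the price, anything short of a 2x speedup loses on cost.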