Date: Wed, 16 Sep 2009 14:39:30 -0400
Subject: Re: Stretched HDFS cluster
From: Edward Capriolo <edlinuxguru@gmail.com>
To: common-user@hadoop.apache.org

On Wed, Sep 16, 2009 at 7:27 AM, Steve Loughran wrote:
> Touretsky, Gregory wrote:
>>
>> Hi,
>>
>>    Does anyone have experience running an HDFS cluster stretched over
>> high-latency WAN connections?
>> Any specific concerns/options/recommendations?
>> I'm trying to set up an HDFS cluster with nodes located in the US,
>> Israel and India - considering it as a potential solution for cross-site
>> data sharing...
>>
>
> I would back up Todd here and say "don't do it -yet". I think there are some
> minor placeholders in the rack hierarchy to have an explicit notion of
> different sites, but nobody has done the work yet. Cross-datacentre data
> balancing and work scheduling is complex, and all the code in Hadoop,
> ZooKeeper, etc. is built on the assumption that latency is low, all machines'
> clocks are going forward at roughly the same rate, the network is fairly
> reliable, routers are unlikely to corrupt data, etc.
>
> Now, if you do want to do >1 site, it would be a profound and useful
> development -I'd expect the MR scheduler, or even the Pig/Hive code
> generators, to take datacentre locality into account, doing as much work
> per site as possible.
> The problem of block distribution changes too, as you
> would want one copy of each block in the other datacentre. Even then, I'd
> start with sites in a single city, on a MAE or other link where bandwidth
> matters less. Note that (as discussed below) on the MAN scale things can
> start to go wrong in ways that are unlikely in a datacentre, and it's those
> failures that will burn you.
>
> Worth reading:
> http://status.aws.amazon.com/s3-20080720.html
> http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
>
> -Steve

On a somewhat related topic: I was showing a co-worker a Hadoop setup, and
he asked, "What if we got a bunch of laptops on the internet, like the
PlayStation 'Folding @ Home'?" Of course these are widely different
distributed models, but I have been thinking about it. Assume:

* Throw data security out the window, and assume everyone is playing fair.
* We have systems with a semi-dedicated IP, like my cable internet,
  with no inbound/outbound restrictions.
* Every computer is its own rack.
* LAN latency is very low, but WAN latency is something like 40 ms.
* We tune replication up to, say, 10 or higher to deal with nodes
  dropping on and off.

Could we run a 100-node cluster? If not, what is stopping it from working?

My next question: for fun, does anyone want to try? We could set up
iptables/firewall rules allowing Hadoop traffic only from the IPs in the
experiment. I have two nodes on a Chicago, US ISP to donate. Even if we
get 10 nodes, that would be interesting as a benchmark.
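The "every computer is its own rack" assumption maps onto Hadoop's existing rack-awareness hook: the NameNode can invoke a user-supplied topology script (wired in via the topology.script.file.name property) that maps each DataNode address to a rack path, and the block placement policy then spreads replicas across racks. A minimal sketch in Python, where the /wan prefix and the one-host-per-rack encoding are my own assumptions for this experiment, not anything Hadoop prescribes:

```python
#!/usr/bin/env python
# Hypothetical topology script for the WAN experiment: every host becomes
# its own "rack", so HDFS treats each machine as an independent failure
# domain and spreads the (high) replication factor across distinct nodes.
import sys

def rack_for(host):
    # Encode each node as its own rack path, e.g. 10.0.0.1 -> /wan/10-0-0-1.
    # The /wan prefix and dash encoding are arbitrary choices for this sketch.
    return "/wan/" + host.replace(".", "-")

if __name__ == "__main__":
    # Hadoop calls the script with one or more host/IP arguments and expects
    # one rack path per argument, whitespace-separated, on stdout.
    print(" ".join(rack_for(h) for h in sys.argv[1:]))
```

With this in place, a replication factor of 10 would put each block on 10 different "racks", i.e. 10 different home machines, which is exactly the drop-on/drop-off tolerance described above, at the cost of 10x the WAN write traffic.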