From: Nicolas Liochon
Date: Thu, 22 Aug 2013 10:21:34 +0200
Subject: Re: rack awarness unexpected behaviour
To: common-user@hadoop.apache.org

Do the jobs run on the whole cluster or on a single rack?

If you write from a single rack, you will get something similar to what you
described, because the default policy is to put one replica locally and the
other two replicas on the same remote rack. It does check that there is
enough space available, but it does not try to balance.

On Thu, Aug 22, 2013 at 9:41 AM, Marc Sturlese wrote:

> Hey there,
> I've set up rack awareness on my Hadoop cluster with replication 3. I have
> 2 racks, and each contains 50% of the nodes.
> I can see that the blocks are spread across the 2 racks; the problem is
> that all nodes in one rack are storing 2 replicas and the nodes in the
> other rack just one. If I launch the Hadoop balancer script, it properly
> spreads the replicas across the 2 racks, leaving all nodes with exactly
> the same available disk space, but after jobs have been running for hours
> the data is unbalanced again (all nodes in rack1 have less free disk space
> than all nodes in rack2).
>
> Any clue what's going on?
> Thanks in advance
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/rack-awarness-unexpected-behaviour-tp4086029.html
> Sent from the Hadoop lucene-users mailing list archive at Nabble.com.
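To make the placement rule from the reply concrete, here is a minimal Python sketch of it. This is a toy model, not HDFS's actual implementation: the rack and node names, the function names, and the block counts are all made up for illustration, and it ignores the free-space check mentioned above. It only encodes "one replica on the writer's node, two replicas on distinct nodes of one other rack" and counts replicas per rack.

```python
import random
from collections import Counter


def place_replicas(writer, racks, rng):
    """Toy placement: writer's node, then two distinct nodes on one other rack."""
    local_rack = next(r for r, nodes in racks.items() if writer in nodes)
    remote_rack = rng.choice([r for r in racks if r != local_rack])
    return [writer] + rng.sample(racks[remote_rack], 2)


def simulate(num_blocks, writer_racks, racks, seed=0):
    """Count replicas per rack when every writer runs on one of `writer_racks`."""
    rng = random.Random(seed)
    writers = [n for r in writer_racks for n in racks[r]]
    node_rack = {n: r for r, nodes in racks.items() for n in nodes}
    per_rack = Counter()
    for _ in range(num_blocks):
        for node in place_replicas(rng.choice(writers), racks, rng):
            per_rack[node_rack[node]] += 1
    return per_rack


# Hypothetical 2-rack cluster with 4 nodes per rack.
racks = {
    "rack1": ["r1n1", "r1n2", "r1n3", "r1n4"],
    "rack2": ["r2n1", "r2n2", "r2n3", "r2n4"],
}

single = simulate(1000, ["rack1"], racks)          # all writers on rack1
both = simulate(1000, ["rack1", "rack2"], racks)   # writers on both racks
print("writers on rack1 only:", dict(single))
print("writers on both racks:", dict(both))
```

With writers on rack1 only, every block leaves one replica on rack1 and two on rack2, so rack2 ends up with exactly twice as many replicas (1000 versus 2000 in this run). With writers spread over both racks, the per-rack counts come out roughly even. In this model the rack that does no writing is the one that fills up faster, which matches the 2-versus-1 pattern described in the original question.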