Subject: DFS maxing out on single datadir
From: highpointe <highpointe3i@gmail.com>
Date: Wed, 18 May 2011 19:37:35 -0600
To: hdfs-user@hadoop.apache.org

System: HDFS data dirs across the cluster are dataone, datatwo, datathree.

I recently had an issue where I lost a slave, which resulted in a large number of under-replicated blocks.

Re-replication was quite slow on the uptake, so I thought running the hadoop balancer would help. This seemed to exacerbate the situation, so I killed the balancer.

Hadoop then proceeded to write all new data to dataone on each slave. It would wait until the dataone dir was at 100% full, then move on to the next slave in sequence. datatwo and datathree were completely ignored.

DFS showed <10% free and was dropping quickly.

I ended up restarting the entire cluster (DFS and MapRed) and things started acting normally again (writing to all three data dirs).

Has anyone experienced this, or have any idea why it would happen?

Thanks for the help.

Sent from my iPhone
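For context, the expected behavior is that each datanode round-robins new blocks across every directory listed in dfs.data.dir, so all three dirs should fill roughly evenly. A minimal sketch of the relevant hdfs-site.xml entry, assuming the three dirs are mounted under a hypothetical /hadoop prefix (paths are illustrative, not from the original message):

```xml
<!-- hdfs-site.xml: comma-separated list of local dirs the datanode
     stores blocks in; the datanode rotates writes across them.
     The /hadoop/* paths below are placeholders. -->
<property>
  <name>dfs.data.dir</name>
  <value>/hadoop/dataone,/hadoop/datatwo,/hadoop/datathree</value>
</property>
```

If writes stick to the first entry only, it can be worth checking (e.g. with `hadoop fsck /` and the datanode logs) whether the later dirs were marked failed or unwritable, since a dir dropped from the rotation is silently skipped until restart.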