From: Todd Lipcon <todd@cloudera.com>
Date: Tue, 22 Nov 2011 16:57:37 -0800
Subject: Re: Blocks are getting corrupted under very high load
To: common-dev@hadoop.apache.org
Cc: hdfs-dev@hadoop.apache.org

Can you look on the DN in question and see whether the block was successfully
finalized when the write finished? It doesn't sound like a successful write --
a successful write should have moved it out of the bbw directory into current/.

-Todd

On Tue, Nov 22, 2011 at 3:16 AM, Uma Maheswara Rao G wrote:
> Hi All,
>
> I have backported HDFS-1779 to our Hadoop version, which is based on the 0.20-append branch.
>
> We are running a load test, as usual. (We want to ensure the reliability of the system under heavy load.)
> The cluster has 8 DataNodes and a NameNode.
> Each machine has 16 CPUs and 12 hard disks, each with 2 TB capacity.
> Clients run along with the DataNodes.
> The clients upload tar files containing 3-4 blocks each, from 50 threads.
> Each block is 256 MB, and the replication factor is 3.
>
> Everything looks fine under normal load.
> When the load is increased, a lot of errors occur, including many pipeline failures.
> All of these are fine, except for the strange case of a few blocks.
>
> Some blocks (around 30) are missing, as the FSCK report shows.
> When I try to read those files, the read fails saying there are no DataNodes for the block.
> Analysing the logs, we found that for these blocks a pipeline recovery happened and the write succeeded to a single DataNode.
> That DataNode also reported the block to the NameNode in a blockReceived command.
> After some time (say, 30 minutes), the DataNode is restarted.
> In the BBW (blocksBeingWritten) report sent by the DN immediately after the restart, these finalized blocks are also included, showing that the blocks are sitting in the blocksBeingWritten folder.
> In many cases, the generation timestamp reported in the BBW report is the old one.
>
> The NameNode rejects such a block in the BBW report, saying the file is already closed.
> The NameNode also asks the DataNode to invalidate the blocks, and the DataNode does so.
> When deleting the blocks, it again prints the path from the blocksBeingWritten directory (and the previous generation timestamp).
>
> This looks very strange to me.
> Does this mean that the finalized block file and meta file (which were written in the current folder) are getting lost after the DN restart?
> That would explain why the NameNode does not receive these blocks' information in the block report sent by the DataNodes.
>
> Regards,
> Uma

--
Todd Lipcon
Software Engineer, Cloudera
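A minimal sketch of the on-disk check suggested above, assuming the 0.20-append
layout where finalized replicas live under <data.dir>/current/ and in-flight
replicas under <data.dir>/blocksBeingWritten/. The data-dir paths and the
find_block helper are illustrative assumptions, not something from the thread.

    #!/usr/bin/env python
    # Rough sketch: locate a block's files on one DataNode and report whether
    # each copy sits under current/ (finalized) or blocksBeingWritten/ (still
    # being written), per the 0.20-append on-disk layout.
    import os
    import sys

    # Hypothetical dfs.data.dir entries -- replace with the DN's actual data dirs.
    DATA_DIRS = ["/data1/dfs/data", "/data2/dfs/data"]

    def find_block(block_id):
        """Return (state, path) for every file named blk_<block_id>* under the data dirs."""
        needle = block_id if block_id.startswith("blk_") else "blk_" + block_id
        hits = []
        for root in DATA_DIRS:
            for dirpath, _subdirs, filenames in os.walk(root):
                for name in filenames:
                    if not name.startswith(needle):
                        continue
                    if "blocksBeingWritten" in dirpath:
                        state = "being-written"
                    elif os.sep + "current" in dirpath:
                        state = "finalized"
                    else:
                        state = "unknown"
                    # The .meta file name also carries the generation stamp
                    # (blk_<id>_<genstamp>.meta), handy for spotting a stale genstamp.
                    hits.append((state, os.path.join(dirpath, name)))
        return hits

    if __name__ == "__main__":
        results = find_block(sys.argv[1]) or [("missing", "<no file found>")]
        for state, path in results:
            print("%-14s %s" % (state, path))

With the block ID in hand (for example, from hadoop fsck / -files -blocks
-locations), this should make it obvious whether the replica ever left
blocksBeingWritten/, and the generation stamp embedded in the .meta file name
can be compared against the one the NameNode expects.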