Date: Wed, 10 Aug 2011 17:01:07 +0200
Subject: Cannot read from file after server crash
From: Dariusz Czerski
To: hdfs-user@hadoop.apache.org

Hi,

I have a Hadoop cluster of 5 servers, version 0.21.0. On one of the servers I have a program that writes to a file in HDFS, calling hflush and hsync after every write operation. When this server crashes (for example, by power-off), killing both the DataNode and my little program, the data written to the file cannot be read by another client for some time. The replication factor is set to 3.

The situation changes after some time (several minutes); I suspect the NameNode needs time to carry out file recovery. When I try to append to this broken file, an AlreadyBeingCreatedException or RecoveryInProgressException is thrown, but at the same time reading from the file returns no data (without any exception).

Is this behavior expected? Is there any way to check whether the file is in the recovery state? If so, the program could wait some time for recovery to finish.

Thanks,
Darek C
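
A minimal sketch of the "wait for recovery to finish" idea, assuming the append attempt itself is used as the probe: retry it while it fails with one of the two exceptions named above, and give up after a fixed number of attempts. The class and method names (RecoveryRetry, retryWhileRecovering, isTransient) are hypothetical, and the sketch uses only the JDK; it matches exceptions by simple class name so it does not depend on any particular Hadoop version. Exceptions that arrive wrapped in some other type without a cause chain would not be recognized by this check.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.Callable;

public class RecoveryRetry {
    // Simple class names treated as "recovery still in progress".
    // These are the two exceptions reported in the message above.
    static final Set<String> TRANSIENT = new HashSet<>(Arrays.asList(
            "AlreadyBeingCreatedException",
            "RecoveryInProgressException"));

    // Returns true if the throwable, or any of its causes, has one of the
    // transient simple class names.
    static boolean isTransient(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (TRANSIENT.contains(c.getClass().getSimpleName())) {
                return true;
            }
        }
        return false;
    }

    // Runs the action, sleeping waitMillis between attempts while it fails
    // with a transient exception. Any other exception, or a transient one
    // on the last allowed attempt, is rethrown to the caller.
    static <T> T retryWhileRecovering(Callable<T> action, int maxAttempts,
                                      long waitMillis) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                if (!isTransient(e) || attempt >= maxAttempts) {
                    throw e;
                }
                Thread.sleep(waitMillis);
            }
        }
    }
}
```

With a Hadoop client on the classpath, the append could be passed in as the action, for example `RecoveryRetry.retryWhileRecovering(() -> fs.append(path), 30, 10_000L)`, where `fs` (a FileSystem) and `path` (a Path) are placeholders for the program's own objects. The attempt count and delay are illustrative values, not recommendations.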