Date: Tue, 25 Sep 2012 13:15:26 -0700
Subject: Re: Detect when file is not being written by another process
From: Andy Isaacson
To: user@hadoop.apache.org

On Tue, Sep 25, 2012 at 9:28 AM, Peter Sheridan wrote:
> We're using Hadoop 1.0.3. We need to pick up a set of large (4+GB) files
> when they've finished being written to HDFS by a different process.

The common way to solve this problem is to modify the writing
application to write to a temporary filename and then rename the
temporary to the target filename when the write is complete. That way,
if the file exists without the temporary tag, the reader can be
confident the file is complete.

-andy
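
The write-then-rename pattern described above can be sketched roughly as
follows. This is only a minimal illustration in Python against the local
filesystem, not HDFS-specific code: the ".tmp" suffix and the function
names write_then_rename and completed_files are hypothetical, chosen for
the example. On HDFS the analogous final step would be a rename through
the Hadoop FileSystem API rather than os.replace.

```python
import os

# Hypothetical marker for files that are still being written.
TMP_SUFFIX = ".tmp"


def write_then_rename(target, chunks):
    """Write all chunks to target + TMP_SUFFIX, then rename to target."""
    tmp = target + TMP_SUFFIX
    with open(tmp, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
        # Push the data to disk before the file becomes visible
        # under its final name.
        f.flush()
        os.fsync(f.fileno())
    # The rename happens only after the write has finished, so the
    # final name never refers to a partially written file.
    os.replace(tmp, target)


def completed_files(directory):
    """Return the names in directory that lack the temporary suffix."""
    return sorted(
        name for name in os.listdir(directory)
        if not name.endswith(TMP_SUFFIX)
    )
```

A reader would poll completed_files() and skip anything still carrying the
temporary suffix. The writer and the reader have to agree on the suffix,
and the temporary and target names should live on the same filesystem so
that the rename does not turn into a copy.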