From: Harsh J
Date: Fri, 17 May 2013 10:42:00 +0530
Subject: Re: Question about writing HDFS files
To: user@hadoop.apache.org

Thanks for the clarification, Rahul. In that case, your reading is
correct (and an HDFS client behaves the same way in and out of MR;
it's not really related to MR at all). A "client outside" would write
to a random set of DataNodes, spanning at least two racks for 3
replicas if rack awareness is turned on.
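To make that concrete, below is a minimal sketch of a client-side
write through the FileSystem API. The NameNode URI and file path are
illustrative assumptions, not from this thread. The same write
pipeline runs whether the caller is an MR task or a standalone
program: data streams packet by packet to the DataNodes the NameNode
picks, not to the client's local disk first.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // "hdfs://namenode:8020" is a placeholder for your NameNode URI.
    FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
    Path file = new Path("/tmp/hdfs-write-example.txt");
    // create() asks the NameNode to allocate blocks; the returned
    // stream writes straight into the DataNode pipeline chosen by the
    // block placement policy (rack-aware if topology is configured).
    FSDataOutputStream out = fs.create(file, true);
    try {
      out.writeBytes("hello, hdfs\n");
    } finally {
      out.close(); // flushes the last packet and completes the file
    }
  }
}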
On Fri, May 17, 2013 at 8:17 AM, Rahul Bhattacharjee wrote:
> Hi Harsh,
>
> I think what John meant by writing to the local disk is writing first
> to the same DataNode that initiated the write call.
>
> John can clarify further.
>
>
> On Fri, May 17, 2013 at 4:23 AM, Harsh J wrote:
>>
>> That is not true. HDFS writes are not staged to a local disk first
>> before being written onto the DataNodes. The old architecture docs
>> seem to suggest that writes get staged to a local disk, but that's
>> not true anymore; see https://issues.apache.org/jira/browse/HDFS-1454.
>>
>> Also worth noting that an HDFS client behaves the same way in almost
>> all contexts, whether it's invoked from an MR framework or directly
>> from the shell.
>>
>> On Fri, May 17, 2013 at 3:38 AM, John Lilley wrote:
>> > I seem to recall reading that when a MapReduce task writes a file,
>> > the blocks of the file are always written to the local disk and then
>> > replicated to other nodes. If this is true, is this also true for
>> > non-MR applications writing to HDFS from Hadoop worker nodes? What
>> > about clients outside of the cluster doing a file load?
>> >
>> > Thanks
>> >
>> > John
>>
>> --
>> Harsh J

--
Harsh J
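For anyone who wants to check John's question empirically, here is a
small follow-up sketch (same assumptions as above: a reachable cluster
config on the classpath, and the illustrative path from the earlier
example). It prints each block's replica locations as rack topology
paths; with rack awareness on, each block of a 3-replica file should
show at least two distinct rack prefixes.

import java.util.Arrays;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBlockLocations {
  public static void main(String[] args) throws Exception {
    // Picks up fs.defaultFS from the cluster config on the classpath.
    FileSystem fs = FileSystem.get(new Configuration());
    Path file = new Path("/tmp/hdfs-write-example.txt"); // illustrative
    FileStatus status = fs.getFileStatus(file);
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());
    for (BlockLocation block : blocks) {
      // Topology paths look like /rack/host:port; distinct rack
      // prefixes confirm cross-rack replication.
      System.out.println(Arrays.toString(block.getTopologyPaths()));
    }
  }
}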