Date: Thu, 26 Jan 2012 18:23:19 -0500
Subject: Re: Replication is done synchronously or asynchronously?
From: "Zhenhua (Gerald) Guo"
To: hdfs-user@hadoop.apache.org

Thanks, Harsh J. Your answer is quite helpful!
If I understand right, writes wait until all replicas are created if there is no error during the replication process. If there is any error in the replication pipeline, dfs.replication.min comes into play. Is my understanding correct?

Gerald

On Thu, Jan 26, 2012 at 4:07 PM, Harsh J wrote:
> Hi,
>
> On Fri, Jan 27, 2012 at 12:27 AM, Zhenhua (Gerald) Guo wrote:
>> I have two questions regarding creation of replicas.
>> - When a user uploads a file to HDFS, does it return as soon as the first
>> replica is created, or does the client need to wait until all replicas are
>> created?
>> - When the output of MapReduce jobs is written to HDFS (by reduce
>> tasks), does the write return when the first replica is
>> created, or does it wait until all replicas are created?
>
> Both questions are the same as both do the same form of DFS write.
>
> Writes are synchronous and replication is pipelined, presently in Apache Hadoop.
>
> But a write will succeed if at least 1 replica was written (controlled
> via dfs.replication.min -- the pipeline can lose DNs because of errors, or can
> get fewer than the requested DNs because of load/space issues, but the write
> will succeed if it gets at least one DN)
>
> Also see the whole conversation at
> http://search-hadoop.com/m/bF99W1ZmNqz1 for some more tidbits you
> might find interesting.
>
> --
> Harsh J
> Customer Ops. Engineer, Cloudera
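
The rule discussed in this thread can be sketched as a toy model. This is not HDFS client code: `pipelined_write`, `WriteResult`, and the `datanode_ok` list are hypothetical names made up for illustration, and `min_replicas` only stands in for dfs.replication.min. It assumes the behaviour described above: the client waits for the whole pipeline, DataNodes that fail drop out, and the write is reported as successful if at least the minimum number of replicas was written.

```python
from dataclasses import dataclass


@dataclass
class WriteResult:
    succeeded: bool          # did the write meet the minimum replica count?
    replicas_written: int    # how many DataNodes actually stored the block


def pipelined_write(datanode_ok, min_replicas=1):
    """Toy model of a synchronous, pipelined block write.

    datanode_ok:  one boolean per DataNode in the pipeline; False means
                  that node failed and dropped out of the pipeline.
    min_replicas: illustrative stand-in for dfs.replication.min.
    """
    # The client blocks until every surviving DataNode has acknowledged,
    # so the count below is only known once the whole pipeline is done.
    replicas_written = sum(1 for ok in datanode_ok if ok)
    # Success depends on the minimum, not on the requested replication.
    return WriteResult(replicas_written >= min_replicas, replicas_written)


if __name__ == "__main__":
    print(pipelined_write([True, True, True]))    # no errors in the pipeline
    print(pipelined_write([True, False, False]))  # two DataNodes lost
    print(pipelined_write([False, False, False])) # every DataNode lost
```

With no errors the write returns only after all three replicas exist. With two DataNodes lost it still succeeds under the default minimum of 1. With every DataNode lost it fails.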