From: Harsh J
Date: Wed, 31 Oct 2012 10:50:32 +0530
Subject: Re: Replication
To: user@hadoop.apache.org

Hi,

Yes, if you are purely a regular client (a non-DN box) writing to HDFS, then the DNs are chosen at random (but within the cross-rack write policy, if that applies to your environment).

On Wed, Oct 31, 2012 at 6:43 AM, Mohit Anchlia wrote:
> Thanks, and if it is not a datanode then I am guessing the namenode decides
> the nodes in the replication pipeline?
>
> On Tue, Oct 30, 2012 at 5:36 PM, ranjith raghunath wrote:
>> If your client node is a datanode within your cluster then the first copy
>> does get written to that datanode.
>>
>> Experts, please feel free to correct me here.
>>
>> On Oct 30, 2012 7:11 PM, "Mohit Anchlia" wrote:
>>> With respect to replication, if I run a Pig job from one of the nodes
>>> within the Hadoop cluster, do I always end up writing 1 replica copy to
>>> that client node and the remaining 2 replica copies to other nodes?

--
Harsh J
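
For readers of the archive, the behaviour discussed in this thread can be sketched as a toy model. This is not Hadoop's code: the function name `choose_targets` and the `topology` mapping (datanode name to rack name) are hypothetical. The rack preferences for the second and third replicas are an assumption based on the default placement policy as described in the HDFS documentation; the thread itself only says that the choice is random within the cross-rack policy.

```python
import random


def choose_targets(client, topology, replication=3, rng=random):
    """Toy model of replica target selection; not Hadoop's actual code.

    topology maps datanode name -> rack name (hypothetical structure).
    """
    nodes = list(topology)
    if replication > len(nodes):
        raise ValueError("not enough datanodes for the requested replication")

    # First replica: the writer's own node if it is a datanode,
    # otherwise a randomly chosen datanode.
    first = client if client in topology else rng.choice(nodes)
    targets = [first]

    while len(targets) < replication:
        remaining = [n for n in nodes if n not in targets]
        if len(targets) == 1:
            # Assumed policy: second replica on a different rack than the first.
            preferred = [n for n in remaining if topology[n] != topology[first]]
        elif len(targets) == 2:
            # Assumed policy: third replica on the same rack as the second.
            preferred = [n for n in remaining
                         if topology[n] == topology[targets[1]]]
        else:
            preferred = []
        # Fall back to any remaining node when the preference cannot be met.
        targets.append(rng.choice(preferred or remaining))
    return targets


topology = {"dn1": "rackA", "dn2": "rackA", "dn3": "rackB", "dn4": "rackB"}
print(choose_targets("dn1", topology))   # writer is a datanode: dn1 comes first
print(choose_targets("edge", topology))  # writer is outside the cluster
```

In this sketch a job launched from `dn1` always gets its first replica on `dn1`, while a writer on a non-datanode host gets a random first target. In both cases the remaining replicas land on other nodes.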