Subject: Re: Hadoop doesnt use Replication Level of Namenode
From: Joey Echeverria
To: common-user@hadoop.apache.org
Date: Tue, 13 Sep 2011 16:52:14 -0400
That won't work with the replication level, as that is entirely a
client-side config. You can partially control it by setting the maximum
replication level.

-Joey

On Tue, Sep 13, 2011 at 10:56 AM, Edward Capriolo wrote:
> On Tue, Sep 13, 2011 at 5:53 AM, Steve Loughran wrote:
>
>> On 13/09/11 05:02, Harsh J wrote:
>>
>>> Ralf,
>>>
>>> There is currently no way to 'fetch' a config. You have the
>>> NameNode's config available at the NNHOST:WEBPORT/conf page, which
>>> you can perhaps save as a resource (dynamically) and load into your
>>> Configuration instance, but apart from this hack the only other ways
>>> are the ones Bharath mentioned. This might lead to slow start-ups of
>>> your clients, but would give you the result you want.
>>>
>>
>> I've done it in a modified version of Hadoop; all it takes is a
>> servlet in the NN. It even served up the live data of the addresses
>> and ports a NN was running on, even if it didn't know them in advance.
>>
>
> Another technique is that if you are using a single replication factor
> on all files, you can mark the property as <final>true</final> in the
> configuration of the NameNode and DataNode. This will always override
> the client settings. However, in general it is best to manage client
> configurations as carefully as you manage the server ones, and to
> ensure that you give clients the configuration they MUST use, via
> puppet/cfengine etc. Essentially, do not count on clients to get them
> right, because the risk is too high if they are set wrong. I.e., your
> situation: "I thought everything was replicated 3 times."
>

--
Joseph Echeverria
Cloudera, Inc.
443.305.9434
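
For reference, the /conf hack Harsh J describes in the quoted text could be sketched roughly as below. This is only an illustration, not anything shipped with Hadoop: it assumes the NameNode's /conf page returns the usual <configuration>/<property> XML, the function names parse_hadoop_conf and fetch_namenode_conf are hypothetical, and the values in SAMPLE are made up for the demonstration.

```python
import urllib.request
import xml.etree.ElementTree as ET


def parse_hadoop_conf(xml_data):
    """Return {name: value} for every <property> in Hadoop-style config XML.

    xml_data may be str or bytes.
    """
    root = ET.fromstring(xml_data)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def fetch_namenode_conf(base_url, timeout=10):
    """Download <base_url>/conf and parse it (hypothetical helper).

    base_url is something like 'http://NNHOST:WEBPORT'; this assumes the
    page is served as XML.
    """
    url = base_url.rstrip("/") + "/conf"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_hadoop_conf(resp.read())


# Illustrative sample only; a real NameNode returns many more properties.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property><name>dfs.replication</name><value>3</value></property>
  <property><name>dfs.replication.max</name><value>512</value></property>
</configuration>"""

conf = parse_hadoop_conf(SAMPLE)
print(conf["dfs.replication"])
```

A client could call fetch_namenode_conf("http://NNHOST:WEBPORT") at start-up and compare the server's dfs.replication with its own setting. As noted above, the value the client actually passes is still what applies to the files it writes, so this only helps detect a mismatch.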