hadoop-common-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: Hadoop doesn't use Replication Level of Namenode
Date Tue, 13 Sep 2011 20:52:14 GMT
That won't work for the replication level, as that is entirely a
client-side config. You can partially control it on the server side by
setting the maximum replication level.
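
For reference, the server-side cap Joey mentions is the `dfs.replication.max` property in the NameNode's hdfs-site.xml; the value below is illustrative, not from the thread:

```xml
<!-- hdfs-site.xml on the NameNode: reject requests that ask for
     more than this many replicas (example value). -->
<property>
  <name>dfs.replication.max</name>
  <value>3</value>
</property>
```

This only caps the replication factor a client may request; it does not raise a client that asks for fewer replicas.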


On Tue, Sep 13, 2011 at 10:56 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> On Tue, Sep 13, 2011 at 5:53 AM, Steve Loughran <stevel@apache.org> wrote:
>> On 13/09/11 05:02, Harsh J wrote:
>>> Ralf,
>>> There is no way to 'fetch' a config at the moment. You have
>>> the NameNode's config available at the NNHOST:WEBPORT/conf page, which
>>> you can save as a resource (dynamically) and load into your
>>> Configuration instance, but apart from this hack the only other ways
>>> are the ones Bharath mentioned. This might lead to slow start-ups of
>>> your clients, but would give you the result you want.
>> I've done it in a modified version of Hadoop; all it takes is a servlet in
>> the NN. It even served up live data on the addresses and ports the NN was
>> running on, even if the client didn't know them in advance.
> Another technique: if you are using a single replication factor on
> all files, you can mark the property as <final>true</final> in the
> configuration of the NameNode and DataNode. This will always override the
> client settings. However, in general it is best to manage client
> configurations as carefully as you manage the server ones, and to ensure
> that you give clients the configuration they MUST use via puppet/cfengine
> etc. Essentially, do not count on clients to get these right, because the
> risk is too high if they are set wrong, i.e. your situation: "I thought
> everything was replicated 3 times."
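
Edward's &lt;final&gt; trick as a config fragment (example value). Note, per Joey's point above, that &lt;final&gt; binds only within the process that loads the file, so to actually pin the replication factor this fragment has to appear in the configuration the clients load:

```xml
<!-- hdfs-site.xml distributed to clients: later resources and
     programmatic set() calls cannot override a final property. -->
<property>
  <name>dfs.replication</name>
  <value>3</value>
  <final>true</final>
</property>
```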
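
A minimal sketch of the "fetch /conf" hack Harsh describes above, in Python using only the standard library: pull the NameNode's conf servlet output and parse it into a dict. The host and port below are placeholders (assumptions), not values from the thread.

```python
from urllib.request import urlopen
import xml.etree.ElementTree as ET

def parse_conf(xml_text):
    """Parse Hadoop's conf-servlet XML (<configuration><property>...)
    into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value")
            for p in root.iter("property")}

def fetch_namenode_conf(host="namenode.example.com", port=50070):
    # The /conf servlet serves the daemon's effective configuration.
    # Host and port are illustrative; use your NN's web UI address.
    with urlopen(f"http://{host}:{port}/conf") as resp:
        return parse_conf(resp.read())
```

In Java, the same idea is to save the servlet output to a file (or stream) and pass it to `Configuration.addResource()`, which is what "load into your Configuration instance" refers to.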

Joseph Echeverria
Cloudera, Inc.
