Subject: Re: final the dfs.replication and fsck
From: Patai Sangbutsarakum <silvianhadoop@gmail.com>
To: user@hadoop.apache.org
Date: Tue, 16 Oct 2012 00:27:40 -0700

Thank you so much for confirming that.

On Mon, Oct 15, 2012 at 9:25 PM, Harsh J wrote:
> Patai,
>
> My bad - that was on my mind but I missed noting it down in my earlier
> reply. Yes, you'd have to control that as well. 2 should be fine for
> smaller clusters.
>
> On Tue, Oct 16, 2012 at 5:32 AM, Patai Sangbutsarakum
> wrote:
>> Just want to share and check whether this makes sense.
>>
>> Jobs failed to run after I restarted the namenode, and the cluster
>> stopped complaining about under-replication.
>>
>> This is what I found in the log file:
>>
>> Requested replication 10 exceeds maximum 2
>> java.io.IOException: file
>> /tmp/hadoop-apps/mapred/staging/apps/.staging/job_201210151601_0494/job.jar.
>> Requested replication 10 exceeds maximum 2
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.verifyReplication(FSNamesystem.java:1126)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplicationInternal(FSNamesystem.java:1074)
>>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setReplication(FSNamesystem.java:1059)
>>         at org.apache.hadoop.hdfs.server.namenode.NameNode.setReplication(NameNode.java:629)
>>         at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>         at java.lang.reflect.Method.invoke(Method.java:597)
>>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:557)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1434)
>>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:143
>>
>> So I scanned through the xml config files, guessed that I should
>> change mapred.submit.replication from 10 to 2, and restarted again.
>>
>> That's when jobs could start running again.
>> Hopefully that change makes sense.
>>
>> Thanks
>> Patai
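For reference, the change Patai describes would look roughly like the
sketch below in mapred-site.xml (mapred.submit.replication is the MR1
property that sets the replication factor for job submission files such
as job.jar in the staging directory; its default of 10 is what tripped
the new maximum of 2):

    <property>
      <name>mapred.submit.replication</name>
      <value>2</value>
      <!-- Replication for job files (job.jar, job.xml) in the staging
           directory. Must not exceed dfs.replication.max; 2 matches the
           cap chosen for this small staging cluster. -->
    </property>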
>> On Mon, Oct 15, 2012 at 1:57 PM, Patai Sangbutsarakum
>> wrote:
>>> Thanks Harsh, dfs.replication.max does the magic!!
>>>
>>> On Mon, Oct 15, 2012 at 1:19 PM, Chris Nauroth wrote:
>>>> Thank you, Harsh. I did not know about dfs.replication.max.
>>>>
>>>> On Mon, Oct 15, 2012 at 12:23 PM, Harsh J wrote:
>>>>>
>>>>> Hey Chris,
>>>>>
>>>>> The dfs.replication param is an exception to the <final> config
>>>>> feature. If one uses the FileSystem API, one can pass in any short
>>>>> value one wants the replication to be. This bypasses the
>>>>> configuration, and the configuration (replication being per-file)
>>>>> is also client-side.
>>>>>
>>>>> The right way for an administrator to enforce a "max" replication
>>>>> value at the create/setReplication level is to set
>>>>> dfs.replication.max to the desired value on the NameNode and
>>>>> restart it.
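A minimal sketch of the client-side behavior described above, assuming
the stock Hadoop FileSystem API (the class name and path are
illustrative, not from the thread): the client can request any
replication factor directly, and only dfs.replication.max on the
NameNode can refuse it.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicationBypassSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration(); // loads *-site.xml from the classpath
        FileSystem fs = FileSystem.get(conf);
        Path p = new Path("/tmp/replication-sketch.txt"); // illustrative path

        // create(path, overwrite, bufferSize, replication, blockSize):
        // the short passed here wins over any dfs.replication in the
        // client's config files, <final> or not, because replication is
        // a per-file attribute chosen by the client.
        FSDataOutputStream out =
            fs.create(p, true, 4096, (short) 10, 64L * 1024 * 1024);
        out.writeBytes("hello\n");
        out.close();

        // Changing it after the fact works the same way; the NameNode
        // rejects the request only if it exceeds dfs.replication.max
        // (the "Requested replication 10 exceeds maximum 2" IOException
        // seen earlier in this thread).
        fs.setReplication(p, (short) 10);
        fs.close();
      }
    }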
>>>>>
>>>>> On Tue, Oct 16, 2012 at 12:48 AM, Chris Nauroth
>>>>> wrote:
>>>>> > Hello Patai,
>>>>> >
>>>>> > Has your configuration file change been copied to all nodes in
>>>>> > the cluster?
>>>>> >
>>>>> > Are there applications connecting from outside of the cluster?
>>>>> > If so, then those clients could have separate configuration files
>>>>> > or code setting dfs.replication (and other configuration
>>>>> > properties). These would not be limited by <final> declarations
>>>>> > in the cluster's configuration files. <final>true</final>
>>>>> > controls configuration file resource loading, but it does not
>>>>> > necessarily block different nodes or different applications from
>>>>> > running with completely different configurations.
>>>>> >
>>>>> > Hope this helps,
>>>>> > --Chris
>>>>> >
>>>>> > On Mon, Oct 15, 2012 at 12:01 PM, Patai Sangbutsarakum
>>>>> > wrote:
>>>>> >>
>>>>> >> Hi Hadoopers,
>>>>> >>
>>>>> >> I have
>>>>> >>
>>>>> >> <property>
>>>>> >>   <name>dfs.replication</name>
>>>>> >>   <value>2</value>
>>>>> >>   <final>true</final>
>>>>> >> </property>
>>>>> >>
>>>>> >> set in hdfs-site.xml in the staging environment cluster. While
>>>>> >> the staging cluster runs code that will later be deployed to
>>>>> >> production, that code tries to set dfs.replication to 3, 10, 50,
>>>>> >> or whatever value other than 2 the developers thought would fit
>>>>> >> the production environment.
>>>>> >>
>>>>> >> Even though I have already made dfs.replication final in the
>>>>> >> staging cluster, every time I run fsck on the staging cluster it
>>>>> >> still reports under-replication.
>>>>> >> I thought the final keyword meant that values in the job config
>>>>> >> would not be honored, but it doesn't seem so when I run fsck.
>>>>> >>
>>>>> >> I am on cdh3u4.
>>>>> >>
>>>>> >> Please suggest.
>>>>> >> Patai
>>>>>
>>>>> --
>>>>> Harsh J
>
> --
> Harsh J
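Putting the thread's resolution together: the enforcement Harsh
recommends lives on the server side. A sketch, assuming hdfs-site.xml
on the NameNode (values taken from the thread; a NameNode restart is
required):

    <property>
      <name>dfs.replication.max</name>
      <value>2</value>
      <!-- NameNode-side cap: create()/setReplication() requests above
           this value fail regardless of client configuration. -->
    </property>

After the restart, fsck can confirm that the cluster no longer reports
under-replicated blocks, e.g.:

    hadoop fsck / -files -blocks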