hadoop-hdfs-user mailing list archives

From Steve Cohen <mail4st...@gmail.com>
Subject Re: replicating existing blocks?
Date Thu, 19 May 2011 16:44:21 GMT
One last question about these replication values. If dfs.replication
and mapred.submit.replication are set to 1, does that mean they get
copied once, so there are two DFS blocks and two job files, or does it
mean there is one DFS block and one job file?

Thanks,
Steve Cohen
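
As far as I know, the replication factor counts total copies, so a value of 1
means a single DFS block and a single job file rather than one extra copy. One
way to confirm what the cluster actually did is to list the file; the second
column of the output is its per-file replication factor (the path below is a
placeholder):

  # second column = per-file replication factor
  hadoop fs -ls /user/steve/somefile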

On Thu, May 19, 2011 at 2:43 AM, Friso van Vollenhoven
<fvanvollenhoven@xebia.com> wrote:
> I believe it's this:
>
> <property>
>  <name>mapred.submit.replication</name>
>  <value>10</value>
>  <description>The replication level for submitted job files.  This
>  should be around the square root of the number of nodes.
>  </description>
> </property>
>
> You can set it per job in the job specific conf and/or in mapred-site.xml.
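
For instance, a one-off override on the command line should be picked up by any
job that runs through ToolRunner/GenericOptionsParser; the jar, class, and paths
here are placeholders:

  # per-job override of the submit replication
  hadoop jar my-job.jar com.example.MyJob -D mapred.submit.replication=2 /input /output

Setting it in mapred-site.xml instead (as in the property above) applies the
change to every job submitted from that client.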
>
>
> Friso
>
>
>
> On 19 May 2011, at 03:42, Steve Cohen wrote:
>
>> Where is the default replication factor on job files set? Is it different than the
>> dfs.replication setting in hdfs-site.xml?
>>
>> Sent from my iPad
>>
>> On May 18, 2011, at 9:10 PM, Joey Echeverria <joey@cloudera.com> wrote:
>>
>>> Did you run a map reduce job?
>>>
>>> I think the default replication factor on job files is 10, which
>>> obviously doesn't work well on a pseudo-distributed cluster.
>>>
>>> -Joey
>>>
>>> On Wed, May 18, 2011 at 5:07 PM, Steve Cohen <mail4steve@gmail.com> wrote:
>>>> Thanks for the answer. Earlier, I asked why I get occasional "not replicated
>>>> yet" errors. But I had dfs.replication set to one, so what replication could it
>>>> have been doing? Did the error messages actually mean that the file couldn't get
>>>> created in the cluster?
>>>>
>>>> Thanks,
>>>> Steve Cohen
>>>>
>>>>
>>>>
>>>> On May 18, 2011, at 6:39 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>
>>>>> Tried to send this, but apparently SpamAssassin finds emails about
>>>>> "replicas" to be spammy. This time with less rich text :)
>>>>>
>>>>> On Wed, May 18, 2011 at 3:35 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>>
>>>>>> Hi Steve,
>>>>>> Running setrep will indeed change those files. Changing "dfs.replication"
>>>>>> just changes the default replication value for files created in the future.
>>>>>> Replication level is a file-specific property.
>>>>>> Thanks
>>>>>> -Todd
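
In other words, dfs.replication only sets the default for files written later;
files that already exist keep their factor until you change it explicitly with
setrep. A sketch of the commands, with the path as a placeholder:

  # raise every existing file under /user/steve to 2 replicas
  # -R recurses, -w waits until re-replication actually completes
  hadoop fs -setrep -R -w 2 /user/steve

  # verify the intended vs. actual replication of each block
  hadoop fsck /user/steve -files -blocks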
>>>>>>
>>>>>> On Wed, May 18, 2011 at 3:32 PM, Steve Cohen <mail4steve@gmail.com> wrote:
>>>>>>>
>>>>>>> Say I add a datanode to a pseudo cluster and I want to change the
>>>>>>> replication factor to 2. I see that I can either run hadoop fs -setrep
>>>>>>> or change the hdfs-site.xml value for dfs.replication. But do either
>>>>>>> of these cause the existing blocks to replicate?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Steve Cohen
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Todd Lipcon
>>>>> Software Engineer, Cloudera
>>>>
>>>
>>>
>>>
>>> --
>>> Joseph Echeverria
>>> Cloudera, Inc.
>>> 443.305.9434
>
>
