hadoop-hdfs-user mailing list archives

From Joey Echeverria <j...@cloudera.com>
Subject Re: replicating existing blocks?
Date Thu, 19 May 2011 17:50:50 GMT
The latter. Replication of 1 means there's only one copy of any given
data block. If you lose that replica, you lose your data.
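
For example, you can check the replication factor of an existing file with
fsck (the path here is just an illustration):

  hadoop fsck /user/steve/data.txt -files -blocks

The output lists the file's replication factor and flags any blocks that
are under-replicated or missing.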

-Joey

On Thu, May 19, 2011 at 9:44 AM, Steve Cohen <mail4steve@gmail.com> wrote:
> One last question about these replication values. If dfs.replication
> and mapred.submit.replication are set to 1, does that mean they get
> copied one time so there are two dfs blocks and two job files or does
> it mean there is one dfs block and one job file?
>
> Thanks,
> Steve Cohen
>
> On Thu, May 19, 2011 at 2:43 AM, Friso van Vollenhoven
> <fvanvollenhoven@xebia.com> wrote:
>> I believe it's this:
>>
>> <property>
>>  <name>mapred.submit.replication</name>
>>  <value>10</value>
>>  <description>The replication level for submitted job files.  This
>>  should be around the square root of the number of nodes.
>>  </description>
>> </property>
>>
>> You can set it per job in the job specific conf and/or in mapred-site.xml.
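>>
>> For example, if your driver runs through ToolRunner (so generic options
>> are parsed), you can override it for a single job on the command line;
>> the jar, class, and path names here are placeholders:
>>
>>  hadoop jar myjob.jar com.example.MyJob -D mapred.submit.replication=2 in out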
>>
>>
>> Friso
>>
>>
>>
>> On 19 May 2011, at 03:42, Steve Cohen wrote:
>>
>>> Where is the default replication factor on job files set? Is it different than
>>> the dfs.replication setting in hdfs-site.xml?
>>>
>>> Sent from my iPad
>>>
>>> On May 18, 2011, at 9:10 PM, Joey Echeverria <joey@cloudera.com> wrote:
>>>
>>>> Did you run a map reduce job?
>>>>
>>>> I think the default replication factor on job files is 10, which
>>>> obviously doesn't work well on a pseudo-distributed cluster.
>>>>
>>>> -Joey
>>>>
>>>>> On Wed, May 18, 2011 at 5:07 PM, Steve Cohen <mail4steve@gmail.com> wrote:
>>>>> Thanks for the answer. Earlier, I asked about why I get occasional "not
>>>>> replicated yet" errors. Now, I had dfs.replication set to one. What
>>>>> replication could it have been doing? Did the error messages actually mean
>>>>> that the file couldn't get created in the cluster?
>>>>>
>>>>> Thanks,
>>>>> Steve Cohen
>>>>>
>>>>>
>>>>>
>>>>> On May 18, 2011, at 6:39 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>
>>>>>> Tried to send this, but apparently SpamAssassin finds emails about
>>>>>> "replicas" to be spammy. This time with less rich text :)
>>>>>>
>>>>>> On Wed, May 18, 2011 at 3:35 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>>>
>>>>>>> Hi Steve,
>>>>>>> Running setrep will indeed change those files. Changing "dfs.replication"
>>>>>>> just changes the default replication value for files created in the future.
>>>>>>> Replication level is a file-specific property.
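>>>>>>>
>>>>>>> For example, to bump the existing files to 2 replicas you could run
>>>>>>> something like this (the path is a placeholder; -R recurses, and -w,
>>>>>>> where your version supports it, waits until the target replication is
>>>>>>> reached):
>>>>>>>
>>>>>>>  hadoop fs -setrep -R 2 /user/steve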
>>>>>>> Thanks
>>>>>>> -Todd
>>>>>>>
>>>>>>> On Wed, May 18, 2011 at 3:32 PM, Steve Cohen <mail4steve@gmail.com> wrote:
>>>>>>>>
>>>>>>>> Say I add a datanode to a pseudo cluster and I want to change the
>>>>>>>> replication factor to 2. I see that I can either run hadoop fs -setrep
>>>>>>>> or change the hdfs-site.xml value for dfs.replication. But does either
>>>>>>>> of these cause the existing blocks to replicate?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Steve Cohen
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Todd Lipcon
>>>>>> Software Engineer, Cloudera
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Joseph Echeverria
>>>> Cloudera, Inc.
>>>> 443.305.9434
>>
>>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
