hadoop-hdfs-user mailing list archives

From Steve Cohen <mail4st...@gmail.com>
Subject Re: replicating existing blocks?
Date Thu, 19 May 2011 17:53:11 GMT
Thanks for clarifying.

On Thu, May 19, 2011 at 1:50 PM, Joey Echeverria <joey@cloudera.com> wrote:
> The latter. Replication of 1 means there's only one copy of any given
> data block. If you lose that replica, you lose your data.
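>
> If you want to check what a given file is at right now, you can ask
> HDFS directly (a quick sketch; /some/file is just a placeholder path):
>
>   hadoop fs -stat %r /some/file
>
> prints the file's current replication factor, and
>
>   hadoop fsck /some/file -files -blocks
>
> lists each block along with how many replicas it actually has.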
>
> -Joey
>
> On Thu, May 19, 2011 at 9:44 AM, Steve Cohen <mail4steve@gmail.com> wrote:
>> One last question about these replication values. If dfs.replication
>> and mapred.submit.replication are set to 1, does that mean they get
>> copied one time so there are two dfs blocks and two job files or does
>> it mean there is one dfs block and one job file?
>>
>> Thanks,
>> Steve Cohen
>>
>> On Thu, May 19, 2011 at 2:43 AM, Friso van Vollenhoven
>> <fvanvollenhoven@xebia.com> wrote:
>>> I believe it's this:
>>>
>>> <property>
>>>  <name>mapred.submit.replication</name>
>>>  <value>10</value>
>>>  <description>The replication level for submitted job files.  This
>>>  should be around the square root of the number of nodes.
>>>  </description>
>>> </property>
>>>
>>> You can set it per job in the job specific conf and/or in mapred-site.xml.
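>>>
>>> For example, to override it for a single job from the command line
>>> (just a sketch; myjob.jar, MyDriver, in and out are placeholders, and
>>> this assumes the driver runs through ToolRunner so -D is picked up):
>>>
>>>   hadoop jar myjob.jar MyDriver -D mapred.submit.replication=2 in out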
>>>
>>>
>>> Friso
>>>
>>>
>>>
>>> On 19 May 2011, at 03:42, Steve Cohen wrote:
>>>
>>>> Where is the default replication factor on job files set? Is it
>>>> different from the dfs.replication setting in hdfs-site.xml?
>>>>
>>>> Sent from my iPad
>>>>
>>>> On May 18, 2011, at 9:10 PM, Joey Echeverria <joey@cloudera.com> wrote:
>>>>
>>>>> Did you run a map reduce job?
>>>>>
>>>>> I think the default replication factor on job files is 10, which
>>>>> obviously doesn't work well on a pseudo-distributed cluster.
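>>>>>
>>>>> If that's what's biting you, you could turn it down for a
>>>>> pseudo-distributed setup, e.g. with something like this in
>>>>> mapred-site.xml (a sketch):
>>>>>
>>>>>  <property>
>>>>>    <name>mapred.submit.replication</name>
>>>>>    <value>1</value>
>>>>>  </property>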
>>>>>
>>>>> -Joey
>>>>>
>>>>> On Wed, May 18, 2011 at 5:07 PM, Steve Cohen <mail4steve@gmail.com> wrote:
>>>>>> Thanks for the answer. Earlier, I asked about why I get occasional
>>>>>> "not replicated yet" errors. But I had dfs.replication set to one.
>>>>>> What replication could it have been doing? Did the error messages
>>>>>> actually mean that the file couldn't get created in the cluster?
>>>>>>
>>>>>> Thanks,
>>>>>> Steve Cohen
>>>>>>
>>>>>>
>>>>>>
>>>>>> On May 18, 2011, at 6:39 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>>
>>>>>>> Tried to send this, but apparently SpamAssassin finds emails about
>>>>>>> "replicas" to be spammy. This time with less rich text :)
>>>>>>>
>>>>>>> On Wed, May 18, 2011 at 3:35 PM, Todd Lipcon <todd@cloudera.com> wrote:
>>>>>>>>
>>>>>>>> Hi Steve,
>>>>>>>> Running setrep will indeed change those files. Changing
>>>>>>>> "dfs.replication" just changes the default replication value for
>>>>>>>> files created in the future. Replication level is a file-specific
>>>>>>>> property.
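>>>>>>>>
>>>>>>>> So in your case, to raise the existing files to 2 copies you could
>>>>>>>> run something like this (the path / is just an example; -w waits
>>>>>>>> for replication to complete and -R recurses):
>>>>>>>>
>>>>>>>>   hadoop fs -setrep -w 2 -R /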
>>>>>>>> Thanks
>>>>>>>> -Todd
>>>>>>>>
>>>>>>>> On Wed, May 18, 2011 at 3:32 PM, Steve Cohen <mail4steve@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> Say I add a datanode to a pseudo-distributed cluster and I want
>>>>>>>>> to change the replication factor to 2. I see that I can either
>>>>>>>>> run hadoop fs -setrep or change the hdfs-site.xml value for
>>>>>>>>> dfs.replication. But does either of these cause the existing
>>>>>>>>> blocks to replicate?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Steve Cohen
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Todd Lipcon
>>>>>>>> Software Engineer, Cloudera
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Todd Lipcon
>>>>>>> Software Engineer, Cloudera
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Joseph Echeverria
>>>>> Cloudera, Inc.
>>>>> 443.305.9434
>>>
>>>
>>
>
>
>
> --
> Joseph Echeverria
> Cloudera, Inc.
> 443.305.9434
>
