hadoop-hdfs-user mailing list archives

From Bertrand Dechoux <decho...@gmail.com>
Subject Re: Replace a block with a new one
Date Mon, 21 Jul 2014 09:19:01 GMT
Do you want to implement RAID on top of HDFS, or use HDFS on top of RAID? I
am not sure I understand either use case. HDFS already handles replication
and error detection for you. Wouldn't fine-tuning the cluster be the easier
solution?

Bertrand Dechoux
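
As background, the degraded-read recovery discussed in this thread can be
sketched with the single-parity (XOR) special case of erasure coding. Real
Reed-Solomon codes use Galois-field arithmetic and tolerate multiple lost
blocks, so this is only an illustration of the idea, not anyone's actual
setup:

```python
# Minimal sketch: single-parity recovery, the XOR special case of
# erasure coding. A production Reed-Solomon codec works over a Galois
# field and supports several parity blocks; the recovery principle is
# the same.

def make_parity(blocks):
    """XOR all equal-sized data blocks together into one parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

def recover(surviving_blocks, parity):
    """XOR the parity with every surviving block to rebuild the lost one."""
    missing = bytearray(parity)
    for block in surviving_blocks:
        for i, b in enumerate(block):
            missing[i] ^= b
    return bytes(missing)

data = [b"aaaa", b"bbbb", b"cccc"]
parity = make_parity(data)
# Lose block 1; rebuild it from the survivors plus parity.
assert recover([data[0], data[2]], parity) == data[1]
```

Recovery only needs the surviving blocks plus parity, which is why a
degraded read can be served without re-writing the whole file.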


On Mon, Jul 21, 2014 at 7:25 AM, Zesheng Wu <wuzesheng86@gmail.com> wrote:

> Thanks for reply, Arpit.
> Yes, we need to do this regularly. The original requirement of this is
> that we want to do RAID(which is based reed-solomon erasure codes) on our
> HDFS cluster. When a block is corrupted or missing, the downgrade read
> needs quick recovery of the block. We are considering how to recovery the
> corrupted/missing block quickly.
>
>
> 2014-07-19 5:18 GMT+08:00 Arpit Agarwal <aagarwal@hortonworks.com>:
>
>> IMHO this is a spectacularly bad idea. Is it a one-off event? Why not
>> just take the perf hit and recreate the file?
>>
>> If you need to do this regularly you should consider a mutable file store
>> like HBase. If you start modifying blocks out from under HDFS you open up
>> all sorts of consistency issues.
>>
>>
>>
>>
>> On Fri, Jul 18, 2014 at 2:10 PM, Shumin Guo <gsmsteve@gmail.com> wrote:
>>
>>> That will break the consistency of the file system, but it doesn't hurt
>>> to try.
>>>  On Jul 17, 2014 8:48 PM, "Zesheng Wu" <wuzesheng86@gmail.com> wrote:
>>>
>>>> How about writing a new block with a new checksum file, and replacing
>>>> both the old block file and the old checksum file?
>>>>
>>>>
>>>> 2014-07-17 19:34 GMT+08:00 Wellington Chevreuil <
>>>> wellington.chevreuil@gmail.com>:
>>>>
>>>>> Hi,
>>>>>
>>>>> there's no way to do that, as HDFS does not provide file update
>>>>> features. You'll need to write a new file with the changes.
>>>>>
>>>>> Notice that even if you manage to find the physical block replica
>>>>> files on the disk, corresponding to the part of the file you want to
>>>>> change, you can't simply update it manually, as this would give a different
>>>>> checksum, making HDFS mark such blocks as corrupt.
>>>>>
>>>>> Regards,
>>>>> Wellington.
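
Wellington's point about checksums can be made concrete. HDFS stores a
checksum per fixed-size chunk of each block replica in a sidecar .meta
file (512-byte chunks by default; CRC32C in recent versions). The sketch
below uses zlib's plain CRC32, not HDFS's actual format, just to show why
an in-place edit of a block file gets the replica flagged as corrupt:

```python
import zlib

CHUNK = 512  # HDFS checksums block data in fixed-size chunks (512 B default)

def chunk_checksums(data):
    """Per-chunk checksum list, roughly analogous to a replica's .meta file."""
    return [zlib.crc32(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

block = bytes(1024)            # stand-in for a block replica file on disk
meta = chunk_checksums(block)  # recorded when the block was first written

tampered = b"\x01" + block[1:]            # edit the replica file in place...
assert chunk_checksums(tampered) != meta  # ...verification now fails
```

Only the checksum of the edited chunk changes, but one mismatch is enough
for the DataNode's scanner to report the whole replica as corrupt.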
>>>>>
>>>>>
>>>>>
>>>>> On 17 Jul 2014, at 10:50, Zesheng Wu <wuzesheng86@gmail.com> wrote:
>>>>>
>>>>> > Hi guys,
>>>>> >
>>>>> > I recently encountered a scenario in which I need to replace an
>>>>> existing block with a newly written block.
>>>>> > The most straightforward way to do this is perhaps the following:
>>>>> suppose the original file is A; we write a new file B composed of the
>>>>> new data blocks, then merge A and B into C, the file we want.
>>>>> > The obvious shortcoming of this method is that it wastes network
>>>>> bandwidth.
>>>>> >
>>>>> > I'm wondering whether there is a way to replace the old block with
>>>>> the new block directly.
>>>>> > Any thoughts?
>>>>> >
>>>>> > --
>>>>> > Best Wishes!
>>>>> >
>>>>> > Yours, Zesheng
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best Wishes!
>>>>
>>>> Yours, Zesheng
>>>>
>>>
>>
>
>
>
>
> --
> Best Wishes!
>
> Yours, Zesheng
>
