hadoop-common-user mailing list archives

From Pete Wyckoff <pwyck...@facebook.com>
Subject Re: Thinking about retriving DFS metadata from datanodes!!!
Date Thu, 11 Sep 2008 18:24:51 GMT

You may want to look at Hadoop's proposal for snapshotting, where one can
take a snapshot's metadata and store it in some disaster-resilient place(s)
for a rainy day:

https://issues.apache.org/jira/browse/HADOOP-3637





On 9/11/08 10:06 AM, "Dhruba Borthakur" <dhruba@gmail.com> wrote:

> My opinion is to not store file-namespace related metadata on the
> datanodes. When a file is renamed, one has to contact all datanodes to
> update this metadata. Worse still, if one renames an entire
> subdirectory, all blocks that belong to all files in the subdirectory
> have to be updated. Similarly, if in the future a file has multiple
> paths to it (links), a block may belong to two filenames.
> 
> In the future, if HDFS wants to implement any kind of de-duplication
> (i.e. if the same block data appears in multiple files, the file
> system can intelligently keep only one copy of the block), it will be
> difficult to do.
> 
> thanks,
> dhruba
> 
> 
> 
> On Wed, Sep 10, 2008 at 7:40 PM, 叶双明 <yeshuangming@gmail.com> wrote:
>> Thanks Ari Rabkin!
>> 
>> 1. I think the cost is very low: if the block size is 10 MB, 1 KB / 10 MB is
>> about 0.01% of the disk space.
>> 
>> 2. Actually, if two racks are lost and replication <= 3, it seems that we
>> can't recover all the data. But if we lose one rack of two and
>> replication >= 2, we can recover all the data.
>> 
>> 3. Suppose we recover 87.5% of the data. I am not sure whether a random
>> 87.5% of the data is useful for every user. But when most files are
>> smaller than the block size, we can recover a large share of the data. Any
>> recovered data may be valuable to some user.
>> 
>> 4. I guess most small companies or organizations just have a cluster with
>> 10-100 nodes, and they cannot afford a second HDFS cluster in a different
>> place or a SAN. I think they would be pleased to have a simple way to
>> ensure data safety for themselves.
>> 
>> 5. We can make it configurable: turn it on when someone needs it, and off otherwise.
>> 
>> Glad to discuss with you!
>> 
>> 
>> 2008/9/11 Ariel Rabkin <asrabkin@gmail.com>
>> 
>>> I don't understand this use case.
>>> 
>>> Suppose that you lose half the nodes in the cluster.  On average,
>>> 12.5% of your blocks were exclusively stored on the half of the cluster
>>> that's dead.  For many (most?) applications, a random 87.5% of the
>>> data isn't really useful.  Storing metadata in more places would let
>>> you turn a dead cluster into a corrupt cluster, but not into a working
>>> one.   If you need to survive major disasters, you want a second HDFS
>>> cluster in a different place.
>>> 
>>> The thing that might be useful to you, if you're worried about
>>> simultaneous namenode and secondary NN failure, is to store the edit
>>> log and fsimage on a SAN, and get fault tolerance that way.
>>> 
>>> --Ari
>>> 
>>> On Tue, Sep 9, 2008 at 6:38 PM, 叶双明 <yeshuangming@gmail.com> wrote:
>>>> Thanks for paying attention to my tentative idea!
>>>> 
>>>> What I had in mind isn't how to store the metadata, but a final
>>>> (last-resort) way to recover valuable data in the cluster when the worst
>>>> happens (something that destroys the metadata on all of the NameNodes).
>>>> E.g. if a terrorist attack or natural disaster destroys half of the
>>>> cluster nodes, including all NameNodes, we can recover as much data as
>>>> possible by this mechanism, and have a good chance of recovering all of
>>>> the cluster's data because of the original replication.
>>>> 
>>>> Any suggestions are appreciated!
>>>> 
>>>> 2008/9/10 Pete Wyckoff <pwyckoff@facebook.com>
>>>> 
>>>>> +1 -
>>>>> 
>>>>> from the perspective of the data nodes, dfs is just a block-level store
>>>>> and is thus much more robust and scalable.
>>>>> 
>>>>> 
>>>>> 
>>>>> On 9/9/08 9:14 AM, "Owen O'Malley" <omalley@apache.org> wrote:
>>>>> 
>>>>>> This isn't a very stable direction. You really don't want multiple
>>>>>> distinct methods for storing the metadata, because discrepancies are
>>>>>> very bad. High Availability (HA) is a very important medium term goal
>>>>>> for HDFS, but it will likely be done using multiple NameNodes and
>>>>>> ZooKeeper.
>>>>>> 
>>>>>> -- Owen
>>>>> 
>>> 
>>> --
>>> Ari Rabkin asrabkin@gmail.com
>>> UC Berkeley Computer Science Department
>>> 
>> 
>> 
>> 
>> --
>> Sorry for my english!!  明
>> Please help me to correct my english expression and error in syntax
>> 

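Dhruba's rename argument in the thread above can be made concrete with a small sketch. It assumes a hypothetical scheme in which every block replica on a datanode also records the path of the file it belongs to. The FileEntry class and the replicas_touched_by_rename function are illustrative names, not part of HDFS. Under that assumption, renaming a directory means rewriting one record per replica of every block beneath it, whereas with namespace metadata kept only on the NameNode the same rename touches no datanode at all.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class FileEntry:
    """Hypothetical namespace entry: a path plus how many blocks it has."""
    path: str
    num_blocks: int
    replication: int = 3  # assumed replication factor


def replicas_touched_by_rename(files: Iterable[FileEntry], old_prefix: str) -> int:
    """Count the block replicas whose stored path would need rewriting
    if the directory `old_prefix` were renamed (hypothetical scheme)."""
    prefix = old_prefix.rstrip("/") + "/"
    return sum(
        f.num_blocks * f.replication
        for f in files
        if f.path.startswith(prefix)
    )


files = [
    FileEntry("/logs/2008/09/a.log", num_blocks=4),
    FileEntry("/logs/2008/09/b.log", num_blocks=2),
    FileEntry("/data/table.dat", num_blocks=10),
]

# Renaming /logs touches every replica of every block under it.
touched = replicas_touched_by_rename(files, "/logs")
print(touched)
```

In this toy namespace the rename of /logs reaches 6 blocks at 3 replicas each; on a real cluster the same multiplication runs over every file in the subtree.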

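The storage-cost estimate in point 1 of the thread (about 1 KB of metadata per 10 MB block) is simple arithmetic, sketched here. The function name metadata_overhead is illustrative, and the 64 MB figure is included only as an assumed alternative block size for comparison.

```python
KIB = 1024
MIB = 1024 * KIB


def metadata_overhead(metadata_bytes: int, block_bytes: int) -> float:
    """Fraction of extra disk space used by per-block metadata."""
    return metadata_bytes / block_bytes


# Figures quoted in the thread: 1 KB of metadata per 10 MB block.
overhead = metadata_overhead(1 * KIB, 10 * MIB)
print(f"{overhead:.4%}")  # prints 0.0098%

# Assumed larger block size, for comparison only.
overhead_64 = metadata_overhead(1 * KIB, 64 * MIB)
print(f"{overhead_64:.4%}")
```

The space cost is indeed negligible, which is why the objections in the thread focus on consistency and rename cost rather than disk usage.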
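Ari's 12.5% figure in the thread matches a simple model: with a replication factor of 3 and half of the nodes dead, a block is lost only if all three replicas sat on dead nodes, which happens with probability (1/2)^3. The sketch below assumes replicas are placed on distinct nodes uniformly at random, which is a simplification (rack-aware placement is not modeled), and the function names are illustrative.

```python
from math import comb


def fraction_of_blocks_lost(total_nodes: int, dead_nodes: int, replication: int) -> float:
    """Probability that every replica of a block is on a dead node,
    assuming replicas go to distinct nodes chosen uniformly at random."""
    if replication > dead_nodes:
        return 0.0  # not enough dead nodes to hold every replica
    return comb(dead_nodes, replication) / comb(total_nodes, replication)


def approx_fraction_lost(dead_fraction: float, replication: int) -> float:
    """Large-cluster approximation: treat each replica as independently
    landing on a dead node with probability `dead_fraction`."""
    return dead_fraction ** replication


# Half the cluster dead, replication factor 3 (assumed).
print(approx_fraction_lost(0.5, 3))          # prints 0.125
print(fraction_of_blocks_lost(100, 50, 3))   # exact value for 100 nodes
```

For a 100-node cluster the exact value is slightly below the approximation, because replicas of one block cannot share a node. Either way, the surviving blocks are a random subset, which is the point of Ari's objection.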