hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: CheckPoint Node
Date Thu, 22 Nov 2012 17:45:14 GMT
Please follow the tips provided at
http://wiki.apache.org/hadoop/FAQ#How_do_I_set_up_a_hadoop_node_to_use_multiple_volumes.3F
and
http://wiki.apache.org/hadoop/FAQ#If_the_NameNode_loses_its_only_copy_of_the_fsimage_file.2C_can_the_file_system_be_recovered_from_the_DataNodes.3F

In short, if you use a non-HA NameNode setup:

- Yes, the NN is a vital persistence point in a running HDFS, and its
data should be stored redundantly for safety.
- In production, you should dedicate a disk to your NameNode's image
and edits (dfs.name.dir in 1.x+, or dfs.namenode.name.dir in
0.23+/2.x+), with adequate free space for gradual growth, and you
should configure multiple disks (with one off-machine NFS mount point
highly recommended for easy recovery) for adequate redundancy.
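
As a rough sketch of the second point, the property takes a
comma-separated list of directories, and the NameNode writes its image
and edits to every directory listed. Both paths below are hypothetical
placeholders (a dedicated local disk and an NFS mount), not values from
any real cluster:

```xml
<!-- hdfs-site.xml, using the 1.x property name.
     On 0.23+/2.x+ the name would be dfs.namenode.name.dir. -->
<property>
  <name>dfs.name.dir</name>
  <!-- Hypothetical paths: one dedicated local disk, one off-machine NFS mount.
       Comma-separated, with no spaces between entries. -->
  <value>/data/1/dfs/nn,/mnt/nfs/dfs/nn</value>
</property>
```

If the local disk dies, the copy on the NFS mount should still hold the
metadata needed to bring up a replacement NameNode.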

If you instead use an HA NameNode setup (I'd highly recommend doing
this since it is now available), the presence of more than one NameNode
and the journal log mount or quorum setup would automatically act as
safeguards for the FS metadata.
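
A partial sketch of what the shared-journal part of an HA configuration
can look like on 2.x is shown here. The nameservice ID "mycluster", the
NameNode IDs "nn1" and "nn2", and every path and hostname are
hypothetical. A working setup also needs per-NameNode RPC addresses, a
client failover proxy provider and fencing settings, as covered in the
HA documentation:

```xml
<!-- hdfs-site.xml on 2.x; all IDs, paths and hosts are hypothetical. -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <!-- Shared mount variant: a directory both NameNodes can read and write. -->
  <value>file:///mnt/filer1/dfs/ha-name-dir-shared</value>
  <!-- Quorum variant would instead use a URI of the form:
       qjournal://jn1.example.com:8485;jn2.example.com:8485;jn3.example.com:8485/mycluster -->
</property>
```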

On Thu, Nov 22, 2012 at 11:03 PM, Jean-Marc Spaggiari
<jean-marc@spaggiari.org> wrote:
> Hi Harsh,
>
> Thanks for pointing me to this link. I will take a close look at it.
>
> So with 1.x and 0.23.x, what's the impact on the data if the namenode
> server's hard drive dies? Is there any critical data stored locally? Or
> do I simply need to build a new namenode, start it and restart all my
> namenodes to get my data back?
>
> I can deal with my application not being available, but losing data
> can be a bigger issue.
>
> Thanks,
>
> JM
>
> 2012/11/22, Harsh J <harsh@cloudera.com>:
>> Hey Jean,
>>
>> The 1.x, 0.23.x release lines both don't have NameNode HA features.
>> The current 2.x releases carry HA-NN abilities, and this is documented
>> at
>> http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailability.html.
>>
>> On Thu, Nov 22, 2012 at 10:18 PM, Jean-Marc Spaggiari
>> <jean-marc@spaggiari.org> wrote:
>>> Replying to myself ;)
>>>
>>> By digging a bit more I figured that 1.0 version is older than 0.23.4
>>> version and that backupnodes are on 0.23.4. Secondarynamenodes on 1.0
>>> are now deprecated.
>>>
>>> I'm still a bit mixed up on the way to achieve HA for the namenode
>>> (1.0 or 0.23.4) but I will continue to dig over internet.
>>>
>>> JM
>>>
>>> 2012/11/22, Jean-Marc Spaggiari <jean-marc@spaggiari.org>:
>>>> Hi,
>>>>
>>>> I'm reading a bit about hadoop and I'm trying to increase the HA of my
>>>> current cluster.
>>>>
>>>> Today I have 8 datanodes and one namenode.
>>>>
>>>> By reading here: http://www.aosabook.org/en/hdfs.html I can see that a
>>>> Checkpoint node might be a good idea.
>>>>
>>>> So I'm trying to start a checkpoint node. I looked at the hadoop
>>>> online doc. There is a link to describe the command usage "For
>>>> command usage, see namenode." but this link is not working. Also, if I
>>>> try hadoop-daemon.sh start namenode -checkpoint as described in the
>>>> documentation, it's not starting.
>>>>
>>>> So I'm wondering, is there anywhere I can find up-to-date
>>>> documentation about the checkpoint node? I will most probably try the
>>>> BackupNode.
>>>>
>>>> I'm using hadoop 1.0.3. The options I have to start on this version
>>>> are namenode, secondarynamenode, datanode, dfsadmin, mradmin, fsck and
>>>> fs. Should I start some secondarynamenodes instead of backupnode and
>>>> checkpointnode?
>>>>
>>>> Thanks,
>>>>
>>>> JM
>>>>
>>
>>
>>
>> --
>> Harsh J
>>



-- 
Harsh J
