hadoop-common-user mailing list archives

From jiang licht <licht_ji...@yahoo.com>
Subject Re: backup namenode setting issue: namenode failed to start
Date Tue, 22 Jun 2010 20:06:18 GMT
Thanks, Konstantin. I will look at options for mounting the folder. Is there a guide to a
successful deployment of this method?

I still have this question: does this backup method only work for a fresh cluster? My guess
is that the namenode only writes metadata for new data into the folders specified in
dfs.name.dir; if so, for a cluster that already holds data, the metadata of the old data
would never be saved to the mounted folder. Is this correct?

--Michael

--- On Mon, 6/21/10, Konstantin Shvachko <shv@yahoo-inc.com> wrote:

From: Konstantin Shvachko <shv@yahoo-inc.com>
Subject: Re: backup namenode setting issue: namenode failed to start
To: common-user@hadoop.apache.org
Date: Monday, June 21, 2010, 1:58 PM

Looks like the mounted file system /mnt/namenode-backup does not support locking.
It should; otherwise HDFS cannot guarantee that only one name-node updates the directory.
You might want to check with your sysadmins; maybe the mount point is misconfigured.
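A quick way to check (a sketch, not Hadoop's own code): the namenode takes its lock via Java's FileChannel.tryLock(), which requests a POSIX file lock. The probe below, with a hypothetical helper name supports_locking, asks for the same kind of lock on a file inside a directory; on an NFS mount without lock support (for example, mounted with the nolock option), the lock call typically fails with "No locks available", the same error behind the namenode stack trace.

```python
# Probe whether a directory's file system supports POSIX file locks,
# the kind Java's FileChannel.tryLock() uses on Linux. (Illustrative
# sketch only; Hadoop itself does this check from Java.)
import fcntl
import os
import sys

def supports_locking(directory):
    """Return True if an exclusive POSIX lock can be taken on a file in `directory`."""
    path = os.path.join(directory, ".lock_probe")
    try:
        with open(path, "w") as f:
            # Non-blocking exclusive lock; raises OSError (e.g. errno 37,
            # "No locks available") if the file system cannot lock.
            fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
            fcntl.lockf(f, fcntl.LOCK_UN)
        return True
    except OSError:
        return False
    finally:
        try:
            os.remove(path)
        except OSError:
            pass

if __name__ == "__main__":
    directory = sys.argv[1] if len(sys.argv) > 1 else "."
    print(directory, "supports locking:", supports_locking(directory))
```

If this prints False for the mount point, the namenode will hit the same IOException when it tries to lock the storage directory there.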

Thanks,
--Konstantin

On 6/21/2010 10:43 AM, jiang licht wrote:
> According to the Hadoop tutorial on the Yahoo developer network and the Hadoop documentation
> on Apache, a simple way to achieve namenode backup and recovery from a single point of
> namenode failure is to also save the dfs metadata to a folder that is mounted on the
> namenode machine but actually lives on a different machine, in addition to the local folder
> on the namenode, as follows:
>
> <property>
>      <name>dfs.name.dir</name>
>      <value>/home/hadoop/dfs/name,/mnt/namenode-backup</value>
>      <final>true</final>
> </property>
>
> where /mnt/namenode-backup is mounted on the namenode machine.
>
> I followed this approach. However, we did not do this to a fresh cluster; we have run the
> cluster for a while, so it already has data in HDFS.
>
> But this method, or my deployment of it, failed, and the namenode simply failed to start.
> I did almost the same thing, except that instead of mounting namenode-backup under /mnt,
> I mounted it under "/". The folder "/namenode-backup" belongs to the account "hadoop",
> under which the cluster runs, so there is no access-restriction issue.
>
> I got the following errors in the namenode log on the namenode machine:
>
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = namenodedomainname/#.#.#.#
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.2+228
> STARTUP_MSG:   build =  -r cfc3233ece0769b11af9add328261295aaf4d1ad; compiled by
'root' on Mon Mar 22 03:11:39 EDT 2010
> ************************************************************/
> 2010-06-14 16:46:53,879 INFO org.apache.hadoop.ipc.metrics.RpcMetrics: Initializing RPC
Metrics with hostName=NameNode, port=50001
> 2010-06-14 16:46:53,886 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Namenode
up at: namenodedomainname/#.#.#.#:50001
> 2010-06-14 16:46:53,888 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM
Metrics with processName=NameNode, sessionId=null
> 2010-06-14 16:46:53,889 INFO org.apache.hadoop.hdfs.server.namenode.metrics.NameNodeMetrics:
Initializing NameNodeMeterics using context object:org.apache.hadoop.metrics.spi.NullContext
> 2010-06-14 16:46:53,934 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=hadoop,hadoop
> 2010-06-14 16:46:53,934 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
> 2010-06-14 16:46:53,934 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
> 2010-06-14 16:46:53,940 INFO org.apache.hadoop.hdfs.server.namenode.metrics.FSNamesystemMetrics:
Initializing FSNamesystemMetrics using context object:org.apache.hadoop.metrics.spi.NullContext
> 2010-06-14 16:46:53,942 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered
FSNamesystemStatusMBean
> 2010-06-14 16:47:23,974 INFO org.apache.hadoop.hdfs.server.common.Storage: java.io.IOException:
No locks available
>          at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>          at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
>          at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>          at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:285)
>          at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)
>
>          at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>          at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
>          at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>          at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:285)
>          at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)
>
> 2010-06-14 16:47:23,976 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem
initialization failed.
> java.io.IOException: No locks available
>          at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>          at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
>          at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>          at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:285)
>          at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)
> 2010-06-14 16:47:23,976 INFO org.apache.hadoop.ipc.Server: Stopping server on 50001
> 2010-06-14 16:47:23,977 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.io.IOException:
No locks available
>          at sun.nio.ch.FileChannelImpl.lock0(Native Method)
>          at sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:881)
>          at java.nio.channels.FileChannel.tryLock(FileChannel.java:962)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.tryLock(Storage.java:527)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:505)
>          at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:363)
>          at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:285)
>          at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:88)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:312)
>          at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:293)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:224)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:306)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1004)
>          at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1013)
>
> 2010-06-14 16:47:23,978 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
> /************************************************************
> SHUTDOWN_MSG: Shutting down NameNode at namenodedomainname/#.#.#.#
> ************************************************************/
>
> Thanks for your help!
>
> -Michael
>