hbase-issues mailing list archives

From "Naresh Rapolu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-1960) Master should wait for DFS to come up when creating hbase.version
Date Wed, 27 Apr 2011 20:38:10 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025989#comment-13025989
] 

Naresh Rapolu commented on HBASE-1960:
--------------------------------------

Got this again with HBase 0.90.2 and the Hadoop 0.20.2 append branch while launching a cluster on
EC2.
The master retries creating /hbase/hbase.version, but the namenode rejects each attempt:
{panel}
2011-04-27 20:26:15,004 WARN org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile:
failed to create file /hbase/hbase.version for DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105
on client 10.108.79.232 because current leaseholder is trying to recreate file.

2011-04-27 20:26:15,005 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50001,
call create(/hbase/hbase.version, rwxr-xr-x, DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105,
true, 3, 67108864) from 10.108.79.232:36701: error: org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException:
failed to create file /hbase/hbase.version for DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105
on client 10.108.79.232 because current leaseholder is trying to recreate file.
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/hbase.version
for DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105 on client 10.108.79.232
because current leaseholder is trying to recreate file.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1182)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1054)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:1002)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.create(NameNode.java:381)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
{panel}

This sequence of events (retry by the master, rejection by the namenode) continues forever; the
master never starts.
The following is from the HBase master logs:

{panel}
2011-04-27 20:12:44,760 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null
bad datanode[0] nodes == null
2011-04-27 20:12:44,760 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations.
Source file "/hbase/hbase.version" - Aborting...
2011-04-27 20:12:44,762 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to create version
file at hdfs://ip-10-108-79-232.ec2.internal:50001/hbase, retrying: org.apache.hadoop.ipc.RemoteException:
java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead
of 1
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1363)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:449)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
..........
.......... (While retrying)
..........

2011-04-27 20:28:15,044 WARN org.apache.hadoop.hbase.util.FSUtils: Unable to create version
file at hdfs://ip-10-108-79-232.ec2.internal:50001/hbase, retrying: org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: failed to create file /hbase/hbase.version
for DFSClient_hb_m_ip-10-108-79-232.ec2.internal:60000_1303935162105 on client 10.108.79.232
because current leaseholder is trying to recreate file.
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:1182)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:1054)
{panel}

> Master should wait for DFS to come up when creating hbase.version
> -----------------------------------------------------------------
>
>                 Key: HBASE-1960
>                 URL: https://issues.apache.org/jira/browse/HBASE-1960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.90.2, 0.92.0
>
>         Attachments: HBASE-1960-redux.patch, HBASE-1960.patch
>
>
> The master does not wait for DFS to come up in the circumstance where the DFS master
is started for the first time after format and no datanodes have been started yet. 
> {noformat}
> 2009-11-07 11:47:28,115 INFO org.apache.hadoop.hbase.master.HMaster: vmName=Java HotSpot(TM)
64-Bit Server VM, vmVendor=Sun Microsystems Inc., vmVersion=14.2-b01
> 2009-11-07 11:47:28,116 INFO org.apache.hadoop.hbase.master.HMaster: vmInputArguments=[-Xmx1000m,
-XX:+HeapDumpOnOutOfMemoryError, -XX:+UseConcMarkSweepGC, -XX:+CMSIncrementalMode, -Dhbase.log.dir=/mnt/hbase/logs,
-Dhbase.log.file=hbase-root-master-ip-10-242-15-159.log, -Dhbase.home.dir=/usr/local/hbase-0.20.1/bin/..,
-Dhbase.id.str=root, -Dhbase.root.logger=INFO,DRFA, -Djava.library.path=/usr/local/hbase-0.20.1/bin/../lib/native/Linux-amd64-64]
> 2009-11-07 11:47:28,247 INFO org.apache.hadoop.hbase.master.HMaster: My address is ip-10-242-15-159.ec2.internal:60000
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception:
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could
only be replicated to 0 nodes, instead of 1
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> [...]
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block
null bad datanode[0] nodes == null
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations.
Source file "/hbase/hbase.version" - Aborting...
> 2009-11-07 11:47:28,729 FATAL org.apache.hadoop.hbase.master.HMaster: Not starting HMaster
because:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version
could only be replicated to 0 nodes, instead of 1
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> {noformat}
> Should probably sleep and retry the write a few times.
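
The suggested sleep-and-retry could be sketched as a small generic wrapper. This is only an illustration of the idea, not the actual patch attached to this issue; the class and method names below (`RetryingWrite`, `withRetries`) are made up for the example, and the simulated write in `main` stands in for the real DFS call that creates hbase.version.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Illustrative sleep-and-retry helper; not HBase's actual implementation. */
public final class RetryingWrite {

    /**
     * Runs the given write, sleeping between attempts, until it succeeds
     * or the attempt budget is exhausted, then rethrows the last failure.
     */
    public static <T> T withRetries(Callable<T> write, int maxAttempts,
                                    long sleepMillis) throws IOException {
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return write.call();
            } catch (Exception e) {
                last = (e instanceof IOException) ? (IOException) e
                                                 : new IOException(e);
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(sleepMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IOException("interrupted while retrying", ie);
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        // Simulated DFS write that fails twice before succeeding,
        // standing in for the namenode refusing the first attempts
        // because no datanodes have checked in yet.
        final int[] calls = {0};
        String result = withRetries(() -> {
            if (++calls[0] < 3) {
                throw new IOException("could only be replicated to 0 nodes");
            }
            return "version file written";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Note that plain retries alone would not escape the lease loop reported above: once a create attempt has taken out a lease, immediately re-issuing create from the same client can fail with AlreadyBeingCreatedException until HDFS's lease limits allow recovery, so the backoff between attempts matters as much as the retry itself.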

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
