hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1960) Master should wait for DFS to come up when creating hbase.version
Date Tue, 08 Mar 2011 19:38:59 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1960:
----------------------------------

    Attachment: HBASE-1960-redux.patch

DFS will immediately leave safe mode with 0 DNs when there are 0 blocks. This is inconvenient.
It's an edge case, but it happens, for example, when setting up EC2 clusters: the master
instance running both NN and HMaster comes up first, and slaves with DNs and RegionServers
come up at some later time.

We used to handle this by checking the current DN count and waiting until it is nonzero. With
security, the check for datanode count doesn't work -- it is a privileged op, we swallow the
IOE, and continue. The attached -redux patch removes the DN count check and instead adopts the
strategy of the jobtracker: we simply retry the creation of hbase.version indefinitely. This
will handle both the secure and nonsecure cases.
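
For readers without the patch at hand, the retry-indefinitely strategy described above can be sketched roughly as follows. This is not the code in HBASE-1960-redux.patch: the class VersionFileRetry, the IOAction interface, and the sleep interval are all hypothetical, and the HDFS write of hbase.version is simulated by a lambda that fails twice so the example runs without a cluster.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; none of these names come from HBase or the patch.
public class VersionFileRetry {

    /** Stand-in for the write of hbase.version; may fail while no datanodes are up. */
    @FunctionalInterface
    public interface IOAction {
        void run() throws IOException;
    }

    /**
     * Runs the action until it completes without an IOException,
     * sleeping between attempts. Returns the number of attempts made.
     */
    public static int retryUntilSuccess(IOAction action, long sleepMillis)
            throws InterruptedException {
        int attempts = 0;
        while (true) {
            attempts++;
            try {
                action.run();
                return attempts;
            } catch (IOException e) {
                // No datanode count check: just log, wait, and try again.
                System.err.println("attempt " + attempts + " failed: " + e.getMessage());
                Thread.sleep(sleepMillis);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate a DFS that has no datanodes during the first two attempts.
        AtomicInteger calls = new AtomicInteger();
        int attempts = retryUntilSuccess(() -> {
            if (calls.incrementAndGet() < 3) {
                throw new IOException(
                    "File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1");
            }
        }, 10L);
        System.out.println("succeeded after " + attempts + " attempts");
    }
}
```

Because the loop only reacts to the write failing, it needs no privileged call to learn the datanode count, which is why the same approach would cover both the secure and nonsecure cases.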

> Master should wait for DFS to come up when creating hbase.version
> -----------------------------------------------------------------
>
>                 Key: HBASE-1960
>                 URL: https://issues.apache.org/jira/browse/HBASE-1960
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Andrew Purtell
>            Assignee: Andrew Purtell
>            Priority: Minor
>             Fix For: 0.20.3, 0.90.0
>
>         Attachments: HBASE-1960-redux.patch, HBASE-1960.patch
>
>
> The master does not wait for DFS to come up in the circumstance where the DFS master is started for the first time after format and no datanodes have been started yet.
> {noformat}
> 2009-11-07 11:47:28,115 INFO org.apache.hadoop.hbase.master.HMaster: vmName=Java HotSpot(TM) 64-Bit Server VM, vmVendor=Sun Microsystems Inc., vmVersion=14.2-b01
> 2009-11-07 11:47:28,116 INFO org.apache.hadoop.hbase.master.HMaster: vmInputArguments=[-Xmx1000m, -XX:+HeapDumpOnOutOfMemoryError, -XX:+UseConcMarkSweepGC, -XX:+CMSIncrementalMode, -Dhbase.log.dir=/mnt/hbase/logs, -Dhbase.log.file=hbase-root-master-ip-10-242-15-159.log, -Dhbase.home.dir=/usr/local/hbase-0.20.1/bin/.., -Dhbase.id.str=root, -Dhbase.root.logger=INFO,DRFA, -Djava.library.path=/usr/local/hbase-0.20.1/bin/../lib/native/Linux-amd64-64]
> 2009-11-07 11:47:28,247 INFO org.apache.hadoop.hbase.master.HMaster: My address is ip-10-242-15-159.ec2.internal:60000
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> [...]
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
> 2009-11-07 11:47:28,728 WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/hbase.version" - Aborting...
> 2009-11-07 11:47:28,729 FATAL org.apache.hadoop.hbase.master.HMaster: Not starting HMaster because:
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/hbase.version could only be replicated to 0 nodes, instead of 1
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1267)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
> {noformat}
> Should probably sleep and retry the write a few times.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
