hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (HBASE-2108) [HA] hbase cluster should be able to ride over hdfs 'safe mode' flip and namenode restart/move
Date Thu, 04 Feb 2010 03:24:28 GMT

     [ https://issues.apache.org/jira/browse/HBASE-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell reassigned HBASE-2108:
-------------------------------------

    Assignee: Andrew Purtell

> [HA] hbase cluster should be able to ride over hdfs 'safe mode' flip and namenode restart/move
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-2108
>                 URL: https://issues.apache.org/jira/browse/HBASE-2108
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: Andrew Purtell
>             Fix For: 0.21.0
>
>
> Todd Lipcon wrote up the following speculation on what happens when the NN is restarted, goes away, or is replaced by a backup under HBase (see Dhruba's note here, http://hadoopblog.blogspot.com/2009/11/hdfs-high-availability.html, that Eli pointed us at for some background on the 0.21 BackupNode feature):
> "For regions that are already open, HBase can continue to serve reads so long as the regionservers are up and do not change state. This is because the HDFS client APIs cache the DFS block locations (a map of block ID -> datanode addresses) for open files.
> "If any HBase action occurs that causes the regionservers to reopen a region (e.g., a region server fails, load balancing rebalances the region assignment, or a compaction finishes) then the reopen will fail as the new file will not be able to access the NameNode to receive the block locations. As these are all periodic operations for HBase, it's impossible to put a specific bound on this time, but my guess is that at least one region server is likely to crash within less than a minute of a NameNode unavailability.
> "Similar properties hold for writes. HBase's writing behavior is limited to Commit Logs which are kept open by the region servers. Writes to commit logs that are already open will continue to succeed, since they only involve the datanodes, but if a region server rolls an edit log, the open() for the new log will fail if the NN is unavailable. There is currently some work going on in HBase trunk to preallocate open files for commit logs to avoid this issue, but it is not complete, and it is not a full solution for the issue. The other issue is that the close() call that completes the write of a commit log also depends on a functioning NameNode - if it is unavailable, the log will be left in an indeterminate state and the edits may become lost when the NN recovers.
> "The rolling of commit logs is triggered either when a timer elapses or when a certain amount of data has been written. Thus, this failure mode will trigger quickly when data is constantly being written to the cluster. If little data is being written, it still may trigger due to the automatic periodic log rolling.
> "Given these above failure modes, I don't believe there is an effective HA solution for HBase at this point. Although HBase may continue to operate for a short time period while a NN recovers, it is also possible that it will fail nearly immediately, depending on when HBase's periodic operations happen to occur. Even with an automatic failover like DRBD+Heartbeat on the NN, the downtime may last 5-10 minutes as the new NN must both replay the edit log and receive block reports from every datanode before it can exit safe mode. I believe this will cause most NN failovers to be accompanied by a partial or complete failure of the HBase cluster."
> The above makes sense to me. Let's fix. Generally, our approach up to now has been that if HDFS goes away, each regionserver deals with it individually by shutting itself down to protect against data loss. We need to handle riding over an NN restart or change of server.
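
For illustration only, the log-rolling behaviour described above (roll when a timer elapses or enough data has been written; the open() of the new log needs the NameNode) could be made to ride over an outage roughly as follows. This is a minimal sketch in Python, not HBase code: `LogRoller`, `NameNodeUnavailable`, `open_log`, and every parameter name are hypothetical. It assumes, as the description states, that appends to an already-open log only involve the datanodes. Closing the old log is left out, since per the description close() also depends on a functioning NameNode.

```python
import time


class NameNodeUnavailable(Exception):
    """Hypothetical stand-in for an open() that fails while the NN is down."""


class LogRoller:
    """Rolls a commit log on size or age, deferring the roll while the NN is down."""

    def __init__(self, open_log, max_bytes, max_age_s, retry_s, clock=time.monotonic):
        self._open_log = open_log    # hypothetical callable; opening needs the NameNode
        self._max_bytes = max_bytes  # roll once this many bytes have been written
        self._max_age_s = max_age_s  # roll once the current log is this old
        self._retry_s = retry_s      # wait this long before retrying a failed roll
        self._clock = clock
        self._log = open_log()
        self._bytes = 0
        self._opened_at = clock()
        self._next_attempt = 0.0
        self.deferred_rolls = 0

    def _roll_due(self, now):
        # Either trigger named in the description: data written, or timer elapsed.
        return (self._bytes >= self._max_bytes
                or now - self._opened_at >= self._max_age_s)

    def append(self, edit):
        now = self._clock()
        if self._roll_due(now) and now >= self._next_attempt:
            try:
                new_log = self._open_log()
            except NameNodeUnavailable:
                # Ride over the outage: keep the current log and retry later.
                self.deferred_rolls += 1
                self._next_attempt = now + self._retry_s
            else:
                self._log = new_log
                self._bytes = 0
                self._opened_at = now
        # Appends to an already-open log do not need the NameNode.
        self._log.append(edit)
        self._bytes += len(edit)
```

The trade-off in this sketch is that the current log keeps growing past its limits for as long as the NameNode is unavailable, instead of the regionserver shutting itself down.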

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

