hbase-dev mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-644) DroppedSnapshotException but RegionServer doesn't restart
Date Sun, 25 May 2008 20:41:55 GMT

     [ https://issues.apache.org/jira/browse/HBASE-644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-644:
------------------------

    Status: Patch Available  (was: Open)

Looking for review...

> DroppedSnapshotException but RegionServer doesn't restart
> ---------------------------------------------------------
>
>                 Key: HBASE-644
>                 URL: https://issues.apache.org/jira/browse/HBASE-644
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Priority: Blocker
>             Fix For: 0.1.3, 0.2.0
>
>         Attachments: 644-0.1-v1.patch
>
>
> RegionServer was carrying -ROOT- and having trouble writing to HDFS.  RegionServer judged
> that a flush had failed and reported a DroppedSnapshotException.  Usually the filesystem check
> would fail and set all the abort flags, but in this case the filesystem somehow reported healthy
> and the flags were not set.  The code path shut down only the RPC and then exited
> the Flusher.  The rest of the RegionServer stayed up and kept reporting to the master.  The
> master thought it was alive and kept trying to scan the unreachable -ROOT-.  The cluster was hosed
> until manual intervention 20 minutes later.
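
The failure mode described above can be illustrated with a minimal sketch. This is not the attached patch and not HBase code: the class FlushAbortSketch, the Server type, and every method name below are hypothetical stand-ins, and the "fixed" handler only shows one plausible way to stop depending on the filesystem check (abort unconditionally once a snapshot is dropped).

```java
public class FlushAbortSketch {

    /** Hypothetical stand-in for HBase's DroppedSnapshotException. */
    static class DroppedSnapshotException extends Exception {
        DroppedSnapshotException(String msg) { super(msg); }
    }

    /** Hypothetical, heavily simplified region server state. */
    static class Server {
        private final boolean fsHealthy;
        private boolean abortRequested = false;

        Server(boolean fsHealthy) { this.fsHealthy = fsHealthy; }

        /** Sets the abort flag only when the filesystem looks unhealthy. */
        boolean checkFileSystem() {
            if (!fsHealthy) {
                abortRequested = true;
            }
            return fsHealthy;
        }

        void abort() { abortRequested = true; }

        /** In this sketch the server keeps reporting until an abort is requested. */
        boolean keepsReportingToMaster() { return !abortRequested; }
    }

    /** Hypothetical flush operation that may drop its snapshot. */
    interface Flush {
        void run() throws DroppedSnapshotException;
    }

    /** Problem path: the abort depends entirely on the filesystem check. */
    static void handleFlushBuggy(Server server, Flush flush) {
        try {
            flush.run();
        } catch (DroppedSnapshotException e) {
            server.checkFileSystem(); // a "healthy" answer leaves the server half alive
        }
    }

    /** Assumed remedy: a dropped snapshot always requests an abort. */
    static void handleFlushFixed(Server server, Flush flush) {
        try {
            flush.run();
        } catch (DroppedSnapshotException e) {
            server.abort(); // no dependence on the filesystem check
        }
    }

    public static void main(String[] args) {
        Flush failing = () -> { throw new DroppedSnapshotException("flush failed"); };

        Server buggy = new Server(true);  // filesystem reports healthy
        handleFlushBuggy(buggy, failing);
        System.out.println("buggy path still reporting: " + buggy.keepsReportingToMaster());

        Server fixed = new Server(true);
        handleFlushFixed(fixed, failing);
        System.out.println("fixed path still reporting: " + fixed.keepsReportingToMaster());
    }
}
```

In the sketch, the first handler leaves the server reporting to the master whenever the filesystem check passes, which mirrors the symptom in the report; the second handler does not consult the check at all.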

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

