hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4078) Silent Data Offlining During HDFS Flakiness
Date Tue, 30 Aug 2011 00:02:37 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093320#comment-13093320 ]

Lars Hofhansl commented on HBASE-4078:

When does the corruption actually happen?

Does any of StoreFile.Writer.{append|appendMetadata|close}(...) fail silently, leaving a corrupt
file? If so, wouldn't it be better to fix that? If any of them throws an exception, we would
skip moving the file anyway.

Or is this a problem deeper in HDFS?
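
To make the question concrete, here is a minimal sketch of the "validate before moving" idea, written against plain java.nio rather than the HBase or HDFS APIs. Everything in it is an assumption for illustration: the class name, the method names, and the 4-byte trailer used as a completeness marker are hypothetical and are not taken from the attached patches. The point is only that a write which fails silently is caught by re-reading the file before it leaves the temporary location.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

// Hypothetical sketch; not the HBase StoreFile implementation.
class ValidateBeforeMove {
    // Assumed marker that a writer appends last, so its presence implies a complete file.
    static final byte[] TRAILER = "END!".getBytes(StandardCharsets.US_ASCII);

    // Writes the payload followed by the trailer to the temporary path.
    static void writeTmp(Path tmp, byte[] payload) throws IOException {
        byte[] out = Arrays.copyOf(payload, payload.length + TRAILER.length);
        System.arraycopy(TRAILER, 0, out, payload.length, TRAILER.length);
        Files.write(tmp, out);
    }

    // Re-reads the file and checks that it ends with the trailer.
    static boolean isComplete(Path file) throws IOException {
        byte[] data = Files.readAllBytes(file);
        if (data.length < TRAILER.length) {
            return false;
        }
        byte[] tail = Arrays.copyOfRange(data, data.length - TRAILER.length, data.length);
        return Arrays.equals(tail, TRAILER);
    }

    // Moves tmp to dest only if validation passes; otherwise throws and leaves tmp in place.
    static void commit(Path tmp, Path dest) throws IOException {
        if (!isComplete(tmp)) {
            throw new IOException("Refusing to move incomplete file: " + tmp);
        }
        Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE);
    }
}
```

With this shape, a truncated file never reaches the destination directory, and the caller sees an IOException instead of a silently corrupt file.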

> Silent Data Offlining During HDFS Flakiness
> -------------------------------------------
>                 Key: HBASE-4078
>                 URL: https://issues.apache.org/jira/browse/HBASE-4078
>             Project: HBase
>          Issue Type: Bug
>          Components: io, regionserver
>    Affects Versions: 0.89.20100924, 0.90.3, 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Pritam Damania
>            Priority: Blocker
>         Attachments: 0001-Validate-store-files-after-compactions-flushes.patch, 0001-Validate-store-files.patch
> See HBASE-1436. The bug fix for that JIRA is a temporary workaround for improperly
> moving partially written files from TMP into the region directory when a FS error occurs.
> Unfortunately, the fix is to ignore all IO exceptions, which masks off-lining due to FS flakiness.
> We need to permanently fix the problem that created HBASE-1436 and then at least have the
> option to not open a region during times of flaky FS.
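
The trade-off described in the issue summary can be sketched as follows. This is an illustration only, with hypothetical names (StrictOpenSketch, FileOpener, openAll, failOnError) that do not correspond to any HBase class or configuration option. In lenient mode an IOException is swallowed and the affected file is skipped, which is how data can go offline silently; in strict mode the first IOException aborts the open.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the HBase region-open code path.
class StrictOpenSketch {
    // Stand-in for whatever opens a single store file and may hit a filesystem error.
    interface FileOpener {
        void open(String name) throws IOException;
    }

    // Returns the names that opened successfully.
    // failOnError=false mimics "ignore all IO exceptions"; failOnError=true refuses to continue.
    static List<String> openAll(List<String> names, FileOpener opener, boolean failOnError)
            throws IOException {
        List<String> opened = new ArrayList<>();
        for (String name : names) {
            try {
                opener.open(name);
                opened.add(name);
            } catch (IOException e) {
                if (failOnError) {
                    throw e; // strict: do not open with missing files
                }
                // lenient: the file is skipped and its data is silently unavailable
            }
        }
        return opened;
    }
}
```

In lenient mode the caller cannot distinguish a healthy open from one that dropped files unless it compares the returned list against the input, which is the masking effect the summary describes.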

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

