hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7507) Make memstore flush be able to retry after exception
Date Tue, 26 Feb 2013 02:06:14 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586627#comment-13586627
] 

chunhui shen commented on HBASE-7507:
-------------------------------------

bq.I wonder what happens retrying after an IOE. Is it ok retrying any IOE? Has the flush path
been reviewed to make sure only IOE is failed NN op? Is it possible to get an IOE after the
file has been successfully written?

Before flushed file committed, we could always retry flushing. 

Flush path is always unique by StoreFile#getUniqueFile.

Store#validateStoreFile may throw exception after successfully written when switching namenode
in the NamenodeHA environment. We have encountered this case.
                
> Make memstore flush be able to retry after exception
> ----------------------------------------------------
>
>                 Key: HBASE-7507
>                 URL: https://issues.apache.org/jira/browse/HBASE-7507
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.6
>
>         Attachments: 7507-94.patch, 7507-trunk v1.patch, 7507-trunk v2.patch, 7507-trunkv3.patch
>
>
> We will abort regionserver if memstore flush throws exception.
> I thinks we could do retry to make regionserver more stable because file system may be
not ok in a transient time. e.g. Switching namenode in the NamenodeHA environment
> {code}
> HRegion#internalFlushcache(){
> ...
> try {
> ...
> }catch(Throwable t){
> DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>           Bytes.toStringBinary(getRegionName()));
> dse.initCause(t);
> throw dse;
> }
> ...
> }
> MemStoreFlusher#flushRegion(){
> ...
> region.flushcache();
> ...
>  try {
> }catch(DroppedSnapshotException ex){
> server.abort("Replay of HLog required. Forcing server shutdown", ex);
> }
> ...
> }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message