Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 8 Jan 2013 05:30:17 +0000 (UTC)
From: "chunhui shen (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12626327.1357554622722.95421.1357623017936@arcas>
In-Reply-To: <JIRA.12626327.1357554622722@arcas>
References: <JIRA.12626327.1357554622722@arcas>
Subject: [jira] [Commented] (HBASE-7507) Make memstore flush be able to
 retry after exception
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-7507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546622#comment-13546622 ] 

chunhui shen commented on HBASE-7507:
-------------------------------------

bq. Why move the location of the validateStoreFile() call?
We can't retry inside HStore#commitFile, but we can retry if validateStoreFile() fails, so I moved the call.

[~stack]
bq. Should we open a new issue to retry all HDFS operations?
We perform HDFS operations for HFiles and HLogs, and we can already tolerate IO errors in the HLog.
So I think retrying the flush is enough, since IO errors during compaction do not matter.

I will address the other comments in a new patch.

Thanks
                
> Make memstore flush be able to retry after exception
> ----------------------------------------------------
>
>                 Key: HBASE-7507
>                 URL: https://issues.apache.org/jira/browse/HBASE-7507
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.3
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: 7507-trunk v1.patch
>
>
> We abort the regionserver if a memstore flush throws an exception.
> I think we could retry instead to make the regionserver more stable, because the file system may be unavailable for a short time, e.g. while switching namenodes in a NameNode HA environment.
> {code}
> HRegion#internalFlushcache() {
>   ...
>   try {
>     ...
>   } catch (Throwable t) {
>     // Any failure while flushing the snapshot is wrapped and rethrown.
>     DroppedSnapshotException dse = new DroppedSnapshotException("region: " +
>         Bytes.toStringBinary(getRegionName()));
>     dse.initCause(t);
>     throw dse;
>   }
>   ...
> }
>
> MemStoreFlusher#flushRegion() {
>   ...
>   try {
>     region.flushcache();
>     ...
>   } catch (DroppedSnapshotException ex) {
>     // The regionserver aborts here instead of retrying the flush.
>     server.abort("Replay of HLog required. Forcing server shutdown", ex);
>   }
>   ...
> }
> {code}
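
A minimal sketch of the retry idea from the description, not the code in the attached patch. The names FlushAction, flushWithRetry, maxAttempts and pauseMillis are hypothetical and do not exist in HBase. It assumes the failure is a transient IOException and that giving up (for example by aborting the server) is still the right fallback once all attempts are used.

```java
import java.io.IOException;

final class FlushRetrySketch {

    /** Hypothetical stand-in for one flush attempt; not an HBase interface. */
    interface FlushAction {
        void flush() throws IOException;
    }

    /**
     * Runs the flush up to maxAttempts times, pausing between attempts.
     * Returns the number of attempts used, or rethrows the last IOException
     * so the caller can fall back to its existing failure handling.
     */
    static int flushWithRetry(FlushAction action, int maxAttempts, long pauseMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.flush();
                return attempt;
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        // Give the file system time to recover before the next attempt.
                        Thread.sleep(pauseMillis);
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        throw new IOException("Interrupted while waiting to retry flush", ie);
                    }
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws IOException {
        final int[] calls = {0};
        // Simulated flush that fails twice and then succeeds.
        int attempts = flushWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated transient file system error");
            }
        }, 5, 10L);
        System.out.println("flush succeeded after " + attempts + " attempt(s)");
    }
}
```

In this sketch the caller decides what to do when the last attempt fails, so the existing abort path can stay in place as the final fallback.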
