Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 7 Jul 2015 00:35:04 +0000 (UTC)
From: "stack (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12843020.1436223710000.112876.1436229304852@Atlassian.JIRA>
In-Reply-To: <JIRA.12843020.1436223710000@Atlassian.JIRA>
References: <JIRA.12843020.1436223710000@Atlassian.JIRA>
 <JIRA.12843020.1436223710772@arcas>
Subject: [jira] [Commented] (HBASE-14028) DistributedLogReplay drops edits
 when ITBLL 125M
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-14028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615977#comment-14615977 ] 

stack commented on HBASE-14028:
-------------------------------

bq. This -recovery-from-failure-during-recovery-from-failure thing looks quite complicated to me. 

Yes. It should work; all the pieces are there. Smile. I've done a few more runs and it passes sometimes. Let me try to figure out where the hole is.

> DistributedLogReplay drops edits when ITBLL 125M
> ------------------------------------------------
>
>                 Key: HBASE-14028
>                 URL: https://issues.apache.org/jira/browse/HBASE-14028
>             Project: HBase
>          Issue Type: Bug
>          Components: Recovery
>    Affects Versions: 1.2.0
>            Reporter: stack
>
> Testing DLR before 1.2.0RC gets cut, we are dropping edits.
> The issue seems to be around replay into a deployed region on a server that dies before all edits have finished replaying. Logging of sequenceid accounting is sparse, so I can't tell for sure how it is happening (or whether our accounting, which is now done by Store, is messing up DLR). Digging.
> I also notice that DLR does not refresh its cache of region locations on error; it just keeps retrying until the whole WAL fails (8 retries, about 30 seconds). We could do a bit of refactoring and have the replay find the region in its new location if it moved during DLR replay.
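
The refactor suggested in the description (refresh the cached region location when a replay attempt fails, rather than retrying the same dead server) could look roughly like the sketch below. This is not HBase code: `ReplayRetrySketch`, `EditSender`, the `locator` function and the string server/region names are all hypothetical stand-ins, and only the retry count of 8 comes from the description.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only; none of these names exist in HBase.
class ReplayRetrySketch {
    /** Hypothetical stand-in for whatever ships a batch of edits to a server. */
    interface EditSender {
        void send(String server, String region) throws IOException;
    }

    private final Map<String, String> locationCache = new HashMap<>();
    private final Function<String, String> locator; // assumed authoritative lookup
    private final EditSender sender;
    private final int maxRetries;

    ReplayRetrySketch(Function<String, String> locator, EditSender sender, int maxRetries) {
        this.locator = locator;
        this.sender = sender;
        this.maxRetries = maxRetries;
    }

    /** Returns the server that accepted the edits, or throws once retries run out. */
    String replay(String region) {
        IOException last = null;
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            // Use the cached location if present, otherwise look it up and cache it.
            String server = locationCache.computeIfAbsent(region, locator);
            try {
                sender.send(server, region);
                return server;
            } catch (IOException e) {
                last = e;
                // Drop the stale entry so the next attempt looks the region up
                // again instead of hitting the same server a second time.
                locationCache.remove(region);
            }
        }
        throw new IllegalStateException(
            "replay of " + region + " failed after " + maxRetries + " attempts", last);
    }
}
```

The only design point is the `locationCache.remove(region)` in the catch block: without it, every retry goes back to the server that just failed, which matches the behavior described above. A real fix would also need backoff between attempts and care around sequenceid accounting, neither of which is modeled here.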

