Date: Mon, 6 Jul 2015 23:02:04 +0000 (UTC)
From: "stack (JIRA)"
To: dev@hbase.apache.org
Reply-To: dev@hbase.apache.org
Subject: [jira] [Created] (HBASE-14028) DistributedLogReplay drops edits when ITBLL 125M

stack created HBASE-14028:
-----------------------------

             Summary: DistributedLogReplay drops edits when ITBLL 125M
                 Key: HBASE-14028
                 URL: https://issues.apache.org/jira/browse/HBASE-14028
             Project: HBase
          Issue Type: Bug
          Components: Recovery
    Affects Versions: 1.2.0
            Reporter: stack

Testing DLR before the 1.2.0 RC gets cut, we are dropping edits. The issue seems to be around replay into a deployed region on a server that dies before all edits have finished replaying. Logging is sparse on sequenceid accounting, so I can't tell for sure how it is happening (or whether accounting by Store, as we now do, is messing up DLR). Digging.

I also notice that DLR does not refresh its cache of region locations on error. It just keeps trying until the whole WAL fails: 8 retries, about 30 seconds. We could do a bit of refactoring and have the replay find the region in its new location if it moved during DLR replay.
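
To make the refresh-on-error idea concrete, here is a rough sketch in plain Java. It is not HBase code: `LocationCache`, `ReplaySink`, `RegionMovedException`, and `replayWithRefresh` are all hypothetical names invented for illustration, and the lookup function only stands in for whatever authoritative source the replay would consult. The one point it tries to show is that dropping the cached location after a failure lets the next attempt go to the new server instead of burning every retry on the old one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch only; none of these types exist in HBase.
class ReplayRetrySketch {

    /** Stand-in for "the region is not on the server we sent the edits to". */
    static class RegionMovedException extends RuntimeException {
        RegionMovedException(String message) {
            super(message);
        }
    }

    /** Stand-in for the call that replays edits for a region on a server. */
    interface ReplaySink {
        void replay(String server, String region);
    }

    /** Caches region -> server and falls back to an authoritative lookup on a miss. */
    static class LocationCache {
        private final Map<String, String> cache = new HashMap<>();
        private final Function<String, String> lookup;

        LocationCache(Function<String, String> lookup) {
            this.lookup = lookup;
        }

        String get(String region) {
            String server = cache.get(region);
            if (server == null) {
                server = lookup.apply(region);
                cache.put(region, server);
            }
            return server;
        }

        void invalidate(String region) {
            cache.remove(region);
        }
    }

    /**
     * Retries the replay up to maxRetries times. After each failure the cached
     * location is dropped, so the next attempt looks the region up again.
     * Returns the number of attempts used.
     */
    static int replayWithRefresh(LocationCache cache, ReplaySink sink,
                                 String region, int maxRetries) {
        if (maxRetries < 1) {
            throw new IllegalArgumentException("maxRetries must be at least 1");
        }
        RegionMovedException last = null;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            String server = cache.get(region);
            try {
                sink.replay(server, region);
                return attempt;
            } catch (RegionMovedException e) {
                last = e;
                cache.invalidate(region); // the step the current retry loop lacks
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new HashMap<>();
        meta.put("r1", "serverA");
        LocationCache cache = new LocationCache(meta::get);
        cache.get("r1");            // cache now holds serverA
        meta.put("r1", "serverB");  // region moves while replay is in flight

        ReplaySink sink = (server, region) -> {
            if (!server.equals(meta.get(region))) {
                throw new RegionMovedException(region + " is not on " + server);
            }
        };
        System.out.println("attempts: " + replayWithRefresh(cache, sink, "r1", 8));
    }
}
```

In this toy setup the first attempt goes to the stale server and fails, and the second succeeds after the re-lookup. Without the `invalidate` call, all 8 attempts would hit the old location.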