Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
Date: Sat, 19 Jul 2014 00:30:40 +0000 (UTC)
From: "Andrew Purtell (JIRA)" <jira@apache.org>
To: dev@hbase.apache.org
Message-ID: <JIRA.12491560.1291050301778.841.1405729840741@arcas>
In-Reply-To: <JIRA.12491560.1291050301778@arcas>
References: <JIRA.12491560.1291050301778@arcas>
Subject: [jira] [Resolved] (HBASE-3281) During log replay on region open,
 hbase.hstore.report.interval.edits setting may be adding non-negligible
 overhead to log replay
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


     [ https://issues.apache.org/jira/browse/HBASE-3281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-3281.
-----------------------------------

    Resolution: Cannot Reproduce

Reopen or file a new issue if this is still relevant with modern HBase versions.

> During log replay on region open, hbase.hstore.report.interval.edits setting may be adding non-negligible overhead to log replay
> --------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-3281
>                 URL: https://issues.apache.org/jira/browse/HBASE-3281
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.0
>            Reporter: Jonathan Gray
>
> On a cluster here, I see a log replay on a region taking about 28 seconds.  It replays approximately 750,000 edits.  Since this can run for a while, we report progress periodically during the replay.  By default we have:
> {noformat}
>       int interval = this.conf.getInt("hbase.hstore.report.interval.edits", 2000);
> {noformat}
> This led to about 300 ZK node re-transitions (from OPENING to OPENING) in about 30 seconds.  I haven't measured the operation in ZK but it's certainly several millis.
> Seems like we could be adding a significant amount of overhead here (5ms * 300 = 1.5 seconds = 5%).  But I think some of these could take >5ms, so we could be adding 10% or more.
> One way to address this would be to do it based on size not entries (this region only had increments, so lots of small edits).  Another way would be to do it based on time instead of entries (check-in every 5 seconds, for example).
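
The time-based alternative suggested above could look roughly like the following. This is a minimal sketch, not HBase code: the class `TimedProgressReporter`, its method names, and the injected clock are all hypothetical, and the `Runnable` stands in for whatever performs the OPENING-to-OPENING re-transition in ZK.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of HBase): reports progress at most once
 * per time interval instead of once per fixed number of edits.
 */
public class TimedProgressReporter {
    private final long intervalMillis;  // minimum time between reports
    private final LongSupplier clock;   // injected so the logic can be tested
    private final Runnable reporter;    // stand-in for the ZK re-transition
    private long lastReportMillis;
    private int reportCount;

    public TimedProgressReporter(long intervalMillis, LongSupplier clock, Runnable reporter) {
        this.intervalMillis = intervalMillis;
        this.clock = clock;
        this.reporter = reporter;
        this.lastReportMillis = clock.getAsLong();
    }

    /** Call once per replayed edit; returns true if progress was reported. */
    public boolean maybeReport() {
        long now = clock.getAsLong();
        if (now - lastReportMillis < intervalMillis) {
            return false;  // too soon since the last report, skip it
        }
        lastReportMillis = now;
        reportCount++;
        reporter.run();
        return true;
    }

    public int getReportCount() {
        return reportCount;
    }
}
```

With a 5 second interval, a replay lasting about 28 seconds would check in 5 times rather than roughly 300 times. In a real replay loop the clock would be something like `System::currentTimeMillis`; reading the clock on every edit has its own cost, so a variant might consult it only every few hundred edits.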

