hbase-issues mailing list archives

From "Hadoop QA (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-9286) ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled
Date Fri, 23 Aug 2013 17:49:51 GMT

    [ https://issues.apache.org/jira/browse/HBASE-9286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748765#comment-13748765 ]

Hadoop QA commented on HBASE-9286:

{color:red}-1 overall{color}.  Here are the results of testing the latest attachment 
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:red}-1 tests included{color}.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    {color:red}-1 patch{color}.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/6861//console

This message is automatically generated.
> ageOfLastShippedOp replication metric doesn't update if the slave regionserver is stalled
> -----------------------------------------------------------------------------------------
>                 Key: HBASE-9286
>                 URL: https://issues.apache.org/jira/browse/HBASE-9286
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Alex Newman
>            Assignee: Alex Newman
>         Attachments: 0001-HBASE-9286.-ageOfLastShippedOp-replication-metric-do.patch
> In replicationmanager:
>         HRegionInterface rrs = getRS();
>         rrs.replicateLogEntries(Arrays.copyOf(this.entriesArray, currentNbEntries));
> ....
>         this.metrics.setAgeOfLastShippedOp(
>             this.entriesArray[currentNbEntries-1].getKey().getWriteTime());
>         break;
> which looks reasonable but is wrong. The problem is that rrs.replicateLogEntries will block
> for a very long time if the slave server is suspended or unavailable but not down, so the
> metric is not updated while the call is stalled.
> However, this is easy to fix. We just need to call refreshAgeOfLastShippedOp()
> on a regular basis, in a different thread. I've attached a patch which fixes this for
> cdh4. I can make one for trunk and the like as well if you need me to, but it's a small
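
A minimal sketch of the idea described in the issue: refresh the age metric from a separate thread so that it keeps growing while the shipping call is blocked. This is not the attached patch and not HBase code. AgeMetricSketch, startRefresher and stop are hypothetical names used only for illustration; only the method names setAgeOfLastShippedOp and refreshAgeOfLastShippedOp are taken from the description above, and their bodies here are assumptions.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the replication metrics object (not the HBase class). */
class AgeMetricSketch {
    private final AtomicLong lastShippedWriteTime = new AtomicLong();
    private final AtomicLong ageOfLastShippedOp = new AtomicLong();
    private ScheduledExecutorService refresher;

    /** Called by the shipping thread after a batch has been replicated. */
    void setAgeOfLastShippedOp(long writeTime) {
        lastShippedWriteTime.set(writeTime);
        refreshAgeOfLastShippedOp();
    }

    /** Recomputes the age from the last recorded write time; safe to call from any thread. */
    void refreshAgeOfLastShippedOp() {
        long last = lastShippedWriteTime.get();
        if (last > 0) { // nothing has been shipped yet, so there is no age to report
            ageOfLastShippedOp.set(System.currentTimeMillis() - last);
        }
    }

    long getAgeOfLastShippedOp() {
        return ageOfLastShippedOp.get();
    }

    /** Starts a daemon thread that refreshes the metric at a fixed period. */
    synchronized void startRefresher(long periodMillis) {
        refresher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "age-metric-refresher");
            t.setDaemon(true); // do not keep the JVM alive just for metrics
            return t;
        });
        refresher.scheduleAtFixedRate(this::refreshAgeOfLastShippedOp,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops the refresher thread, if one was started. */
    synchronized void stop() {
        if (refresher != null) {
            refresher.shutdownNow();
        }
    }
}
```

With a refresher running, the reported age keeps increasing even if the thread that calls setAgeOfLastShippedOp is stuck inside a blocking call to the slave, which is the behaviour the issue asks for.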

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
