hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-21164) reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).
Date Thu, 06 Sep 2018 22:55:00 GMT

     [ https://issues.apache.org/jira/browse/HBASE-21164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-21164:
--------------------------
    Description: 
RegionServers call reportForDuty on startup to tell the Master they are available. While the
Master is initializing, which can take a long time on a big cluster, particularly if something
is amiss, the retry logged every three seconds is annoying and accomplishes nothing. reportForDuty
should back off after each failure, up to a reasonable maximum period. Here is an example:

{code}
2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
....
{code}

For example, I am looking at a large cluster now that had a backlog of procedure WALs. It
is taking a couple of hours to recreate the procedure state because there are millions of procedures
outstanding. Meanwhile, the Master log is just full of the above message, every three seconds...

  was:
RegionServers call reportForDuty on startup to tell the Master they are available. While the
Master is initializing, which can take a long time on a big cluster, particularly if something
is amiss, the retry logged every three seconds is annoying and accomplishes nothing. reportForDuty
should back off after each failure, up to a reasonable maximum period. Here is an example:

{code}
2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
....
{code}


> reportForDuty should do (exponential) backoff rather than retry every 3 seconds (default).
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-21164
>                 URL: https://issues.apache.org/jira/browse/HBASE-21164
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: stack
>            Assignee: Mingliang Liu
>            Priority: Minor
>         Attachments: HBASE-21164.branch-2.1.001.patch
>
>
> RegionServers call reportForDuty on startup to tell the Master they are available. While the
> Master is initializing, which can take a long time on a big cluster, particularly if something
> is amiss, the retry logged every three seconds is annoying and accomplishes nothing. reportForDuty
> should back off after each failure, up to a reasonable maximum period. Here is an example:
> {code}
> 2018-09-06 14:01:39,312 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty to master=vc0207.halxg.cloudera.com,22001,1536266763109 with port=22001, startcode=1536266763109
> 2018-09-06 14:01:39,312 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: reportForDuty failed; sleeping and then retrying.
> ....
> {code}
> For example, I am looking at a large cluster now that had a backlog of procedure WALs.
> It is taking a couple of hours to recreate the procedure state because there are millions of
> procedures outstanding. Meanwhile, the Master log is just full of the above message, every
> three seconds...
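
The backoff requested in the description could look roughly like the sketch below. It is not the attached patch and not HBase code: the class ReportBackoff, the method sleepMs, and the 5-minute cap used in the note afterwards are hypothetical, chosen only for illustration. The one value taken from the issue is the 3-second starting interval.

```java
/**
 * Hypothetical helper (not part of HBase): computes how long to sleep
 * before a retry, doubling the wait on each failure up to a fixed cap.
 */
class ReportBackoff {
    private final long initialSleepMs;
    private final long maxSleepMs;

    ReportBackoff(long initialSleepMs, long maxSleepMs) {
        if (initialSleepMs <= 0 || maxSleepMs < initialSleepMs) {
            throw new IllegalArgumentException(
                "need 0 < initialSleepMs <= maxSleepMs");
        }
        this.initialSleepMs = initialSleepMs;
        this.maxSleepMs = maxSleepMs;
    }

    /**
     * Sleep time before retry number {@code attempt} (0-based):
     * initialSleepMs * 2^attempt, capped at maxSleepMs.
     */
    long sleepMs(int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        long sleep = initialSleepMs;
        // Double step by step and stop at the cap; comparing against
        // maxSleepMs / 2 avoids overflowing a long for large attempts.
        for (int i = 0; i < attempt && sleep < maxSleepMs; i++) {
            sleep = (sleep > maxSleepMs / 2) ? maxSleepMs : sleep * 2;
        }
        return sleep;
    }
}
```

Assuming a 3-second start and a 5-minute cap, new ReportBackoff(3000L, 300000L) would wait 3s, 6s, 12s, 24s, 48s, 96s, 192s, and then 300s before every later attempt, so a Master that takes hours to initialize would see one retry message per RegionServer every five minutes instead of every three seconds.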



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
