hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-507) In master, there are a load of places where no sleep between retries
Date Sat, 05 Apr 2008 02:05:24 GMT

    [ https://issues.apache.org/jira/browse/HBASE-507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12585868#action_12585868 ]

Jim Kellerman commented on HBASE-507:
-------------------------------------

> Bryan Duxbury - 04/Apr/08 04:41 PM
> RetryableOperation
>
>    * Needs a class comment.

Done.

>    * It's not a generically retryable operation, so the name is a tad misleading.
>      Maybe something like RetryableMetaOperation?

Changed.

> TableOperation
>
>    * process() - if you need to tag the end of your loops with comments, you should
>      consider splitting the method into a few different methods. How about everything
>      inside the Retryable becomes processSingleMetaRegion?

Well, those comments were from before, when the retry logic was in place. It was not
possible to factor it out into a separate method, but I did create a private internal
class that cleaned it up nicely.

> Nice to haves:
>
> ProcessRegionServerShutdown:
>
>    * process() is a very long method (100 lines). If it were broken up a little, it
>      might make the indentation in the anonymous Retryable classes a little less ugly.

Again, I couldn't do it as a separate method, but I did create two inner classes that
clean it up.


> In master, there are a load of places where no sleep between retries
> --------------------------------------------------------------------
>
>                 Key: HBASE-507
>                 URL: https://issues.apache.org/jira/browse/HBASE-507
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.2.0
>            Reporter: stack
>            Assignee: Jim Kellerman
>             Fix For: 0.2.0
>
>         Attachments: 507-0.1.patch, 507-trunk.patch
>
>
> Here is an example:
> {code}
> 270308 2008-03-12 14:10:02,054 DEBUG org.apache.hadoop.hbase.HMaster: numberOfMetaRegions: 1, onlineMetaRegions.size(): 1
> 270309 2008-03-12 14:10:02,054 DEBUG org.apache.hadoop.hbase.HMaster: process server shutdown scanning .META.,,1 on XX.XX.XX.184:60020 HMaster
> 270310 2008-03-12 14:10:02,056 DEBUG org.apache.hadoop.hbase.HMaster: process server shutdown scanning .META.,,1 on XX.XX.XX.184:60020 HMaster
> 270311 2008-03-12 14:10:02,057 DEBUG org.apache.hadoop.hbase.HMaster: process server shutdown scanning .META.,,1 on XX.XX.XX.184:60020 HMaster
> 270312 2008-03-12 14:10:02,059 DEBUG org.apache.hadoop.hbase.HMaster: process server shutdown scanning .META.,,1 on XX.XX.XX.184:60020 HMaster
> 270313 2008-03-12 14:10:02,060 DEBUG org.apache.hadoop.hbase.HMaster: process server shutdown scanning .META.,,1 on XX.XX.XX.184:60020 HMaster
> 270314 2008-03-12 14:10:02,062 WARN org.apache.hadoop.hbase.HMaster: Processing pending operations: ProcessServerShutdown of XX.XX.XX.180:60020
> 270315 org.apache.hadoop.hbase.NotServingRegionException: org.apache.hadoop.hbase.NotServingRegionException .META.,,1
> 270316         at org.apache.hadoop.hbase.HRegionServer.getRegion(HRegionServer.java:1606)
> ...
> {code}
> What's actually going on here is 5 retries without a wait in between (logging should
> include the index number of each retry). There seems to be a bunch of duplicated code
> around retrying that we might be able to fix with a Callable. Jim Firby today suggested
> we do exponential backoffs in our retries.
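
The idea in the description above (one shared retry helper built on a Callable, a
sleep between attempts, exponential backoff, and the attempt index in the log) can
be sketched roughly as follows. This is an illustration only, not the code in the
attached patches: the names BackoffRetry, Sleeper, maxAttempts and initialPauseMs
are hypothetical, and logging goes to System.err rather than the master's logger.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not a class from the HBASE-507 patch. */
final class BackoffRetry {

  /** Pause hook so callers (and tests) can choose how to wait between attempts. */
  interface Sleeper {
    void sleep(long millis) throws InterruptedException;
  }

  private BackoffRetry() {}

  /**
   * Calls op until it succeeds or maxAttempts is reached. After each failed
   * attempt it sleeps, then doubles the pause for the next failure.
   */
  static <T> T call(Callable<T> op, int maxAttempts, long initialPauseMs, Sleeper sleeper)
      throws Exception {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    long pause = initialPauseMs;
    for (int attempt = 1; ; attempt++) {
      try {
        return op.call();
      } catch (InterruptedException e) {
        throw e; // do not retry when the thread was asked to stop
      } catch (Exception e) {
        if (attempt >= maxAttempts) {
          throw e; // out of attempts: surface the last failure
        }
        // Log the attempt index so repeated failures can be told apart.
        System.err.println("attempt " + attempt + " of " + maxAttempts
            + " failed: " + e + "; sleeping " + pause + " ms");
        sleeper.sleep(pause);
        pause *= 2; // exponential backoff
      }
    }
  }
}
```

A caller might pass Thread::sleep as the Sleeper and wrap the meta-region scan in
the Callable, for example BackoffRetry.call(() -> scanMeta(), 5, 1000L, Thread::sleep),
where scanMeta() is a placeholder for whatever operation is being retried.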

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

