hbase-dev mailing list archives

From "Izaak Rubin (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-727) Client caught in an infinite loop when trying to connect to cached server locations
Date Mon, 07 Jul 2008 23:00:31 GMT

    [ https://issues.apache.org/jira/browse/HBASE-727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12611389#action_12611389 ]

Izaak Rubin commented on HBASE-727:
-----------------------------------

Here's an excerpt from a log file that demonstrates the problem:

{code}
2008-07-03 15:28:03,890 INFO org.apache.hadoop.hbase.master.ServerManager: Received MSG_REPORT_OPEN: -ROOT-,,0 from 127.0.0.1:56998
2008-07-03 15:28:03,891 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 1, Num Servers: 1, Avg Load: 1.0
2008-07-03 15:28:03,892 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scanning meta region {regionname: -ROOT-,,0, startKey: <>, server: 127.0.0.1:56998}
2008-07-03 15:28:03,951 DEBUG org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScannerREGION => {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192, TABLE => {NAME => '.META.', FAMILIES => [{NAME => 'historian', VERSIONS => 2147483647, COMPRESSION => 'NONE', IN_MEMORY => false, BLOCKCACHE => false, LENGTH => 2147483647, TTL => FOREVER, BLOOMFILTER => NONE}, {NAME => 'info', VERSIONS => 1, COMPRESSION => 'NONE', IN_MEMORY => false, BLOCKCACHE => false, LENGTH => 2147483647, TTL => FOREVER, BLOOMFILTER => NONE}]}}, SERVER => '127.0.0.1:56544', STARTCODE => 1215123936723
2008-07-03 15:28:03,951 DEBUG org.apache.hadoop.hbase.master.BaseScanner: Current assignment of .META.,,1 is not valid: serverInfo: null, passed startCode: 1215123936723, storedInfo.startCode: -1, unassignedRegions: false, pendingRegions: false
2008-07-03 15:28:03,953 INFO org.apache.hadoop.hbase.master.BaseScanner: RegionManager.rootScanner scan of meta region {regionname: -ROOT-,,0, startKey: <>, server: 127.0.0.1:56998} complete
2008-07-03 15:28:04,815 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 1, Num Servers: 1, Avg Load: 1.0
2008-07-03 15:28:04,821 DEBUG org.apache.hadoop.hbase.client.HConnectionManager$TableServers: Found ROOT REGION => {NAME => '-ROOT-,,0', STARTKEY => '', ENDKEY => '', ENCODED => 70236052, TABLE => {NAME => '-ROOT-', FAMILIES => [{NAME => 'info', VERSIONS => 1, COMPRESSION => 'NONE', IN_MEMORY => false, BLOCKCACHE => false, LENGTH => 2147483647, TTL => FOREVER, BLOOMFILTER => NONE}]}
2008-07-03 15:28:04,834 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:56544. Already tried 1 time(s).
2008-07-03 15:28:05,834 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:56544. Already tried 2 time(s).
2008-07-03 15:28:06,835 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:56544. Already tried 3 time(s).
2008-07-03 15:28:07,836 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: /127.0.0.1:56544. Already tried 4 time(s).
{code}

For reference, 127.0.0.1:56544 was the address of the server in use before the restart, and
port 56998 is the one in use after the restart. The retry messages continue indefinitely (only
the first 4 are shown above). I'll attach a text file with more of the surrounding log.

> Client caught in an infinite loop when trying to connect to cached server locations
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-727
>                 URL: https://issues.apache.org/jira/browse/HBASE-727
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: client, ipc
>            Reporter: Izaak Rubin
>            Assignee: Izaak Rubin
>            Priority: Minor
>         Attachments: hbase-727_logfile_sample.txt
>
>
> HbaseRPC, which (to my understanding) is used whenever there is a need to connect to a server,
> enters an infinite loop that retries the connection until it succeeds. This makes sense for
> server-to-server interaction, but not necessarily for all client-to-server interaction.
> The problem I first observed came from doing fast restarts of HBase. When I attempted to
> reload the UI after a restart, it would keep trying indefinitely to re-contact the cached
> server location from before the restart. The correct behavior would be to break out of the
> loop as soon as possible in situations like this one. I think that throwing a
> RetriesExhaustedException would be the best way to do this, although if anyone has other
> suggestions, please let me know.
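
To make the proposal in the description concrete, here is a minimal sketch of a bounded retry loop that gives up with an exception instead of retrying forever. It is not the HBase code or the eventual patch: `Attempt`, `callWithRetries`, and `RetriesExhaustedSketchException` are hypothetical names invented for illustration, and the attempt limit and pause are assumed parameters rather than real HBase configuration keys.

```java
import java.io.IOException;
import java.net.ConnectException;

/** Hypothetical stand-in for HBase's RetriesExhaustedException (not the real class). */
class RetriesExhaustedSketchException extends IOException {
    RetriesExhaustedSketchException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical callback representing a single connection attempt. */
interface Attempt<T> {
    T run() throws IOException;
}

final class BoundedRetrySketch {
    private BoundedRetrySketch() {}

    /**
     * Runs the attempt up to maxAttempts times, pausing between failures.
     * Throws RetriesExhaustedSketchException once the limit is reached,
     * carrying the last I/O failure as its cause.
     */
    static <T> T callWithRetries(Attempt<T> attempt, int maxAttempts, long pauseMillis)
            throws IOException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException lastFailure = null;
        for (int tries = 1; tries <= maxAttempts; tries++) {
            try {
                return attempt.run();
            } catch (IOException e) {
                lastFailure = e; // remember the failure and fall through to retry
            }
            if (tries < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw new IOException("Interrupted while waiting to retry", ie);
                }
            }
        }
        throw new RetriesExhaustedSketchException(
                "Gave up after " + maxAttempts + " attempt(s)", lastFailure);
    }

    public static void main(String[] args) {
        try {
            // Simulates a client that keeps dialing a stale, pre-restart address.
            BoundedRetrySketch.<String>callWithRetries(() -> {
                throw new ConnectException("Connection refused: /127.0.0.1:56544");
            }, 4, 10L);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a limit in place, a caller such as the UI can catch the exception, discard the cached location, and look the server up again rather than hanging on the old address.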

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

