hbase-issues mailing list archives

From "Jonathan Gray (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-3168) Sanity date and time check when a region server joins the cluster
Date Sat, 06 Nov 2010 23:04:21 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929267#action_12929267 ]

Jonathan Gray commented on HBASE-3168:
--------------------------------------

@Jeff, so looks like just adding the timestamp as an option to regionServerStartup should
work for this jira.  You want to finish up this jira (RC being cut this week)?  I can do it
if not.

@Stack, I think you mixed one of your startup/reports but if you're saying do away with the
startup message and just use the report, that sounds good (since we already support that),
but we can eventually get rid of the report as well.  I think we're just relying on load and
split stuff for report now?  And some RS discovery?  Maybe not in 0.92 but when we do more
with load then it's something to consider.  All discovery via ZK nodes then.

Anyways, I'll stop hijacking the jira.  Let's just put a timestamp into the startup method.

> Sanity date and time check when a region server joins the cluster
> -----------------------------------------------------------------
>
>                 Key: HBASE-3168
>                 URL: https://issues.apache.org/jira/browse/HBASE-3168
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.89.20100924
>         Environment: RHEL 5.5 64bit, 1 Master 4 Region Servers
>            Reporter: Jeff Whiting
>             Fix For: 0.90.0
>
>         Attachments: HBASE-3168-trunk-v1.txt
>
>
> Introduce a sanity check when a RS joins the cluster to make sure its clock isn't too
> far out of sync with the rest of the cluster.  If the RS's clock is skewed too far, the
> master would prevent it from joining, and the RS would log the error and die.
> Having a RS with even small differences in time can cause huge problems due to how HBase
> stores values with timestamps.
> According to J-D, in ServerManager we are already doing:
> {code}
>     HServerInfo info = new HServerInfo(serverInfo);
>     checkIsDead(info.getServerName(), "STARTUP");
>     checkAlreadySameHostPort(info);
>     recordNewServer(info, false, null);
> {code}
> The new check would fit in nicely there.
> JG suggests we add a "ClockOutOfSync-like exception".
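
For illustration only, the check described in the issue could be sketched roughly as below. This is a minimal sketch under stated assumptions, not the attached patch: the class name ClockSkewCheck, the method checkClockSkew, its parameters, and the 30-second threshold used in main are all hypothetical, and ClockOutOfSyncException here is just a stand-in for the "ClockOutOfSync-like exception" suggested above. It assumes the region server passes its current time to the master at startup, as proposed in the comment.

```java
import java.io.IOException;

// Hypothetical stand-alone sketch; none of these names come from the patch.
class ClockSkewCheck {

  /** Stand-in for the "ClockOutOfSync-like exception" suggested in the issue. */
  static class ClockOutOfSyncException extends IOException {
    ClockOutOfSyncException(String message) {
      super(message);
    }
  }

  /**
   * Throws if the time reported by a region server differs from the
   * master's time by more than maxSkewMillis (in either direction).
   */
  static void checkClockSkew(String serverName,
                             long masterTimeMillis,
                             long regionServerTimeMillis,
                             long maxSkewMillis) throws ClockOutOfSyncException {
    long skew = Math.abs(masterTimeMillis - regionServerTimeMillis);
    if (skew > maxSkewMillis) {
      throw new ClockOutOfSyncException("Server " + serverName
          + " rejected: reported time is " + skew
          + "ms out of sync with the master (max allowed " + maxSkewMillis + "ms)");
    }
  }

  public static void main(String[] args) {
    long masterNow = System.currentTimeMillis();
    long maxSkewMillis = 30_000L; // assumed threshold, chosen only for this demo
    try {
      // Simulate a region server whose clock runs one minute ahead.
      checkClockSkew("rs-example,60020", masterNow, masterNow + 60_000L, maxSkewMillis);
      System.out.println("Region server accepted");
    } catch (ClockOutOfSyncException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In the flow quoted above, a call like this would presumably sit alongside checkIsDead and checkAlreadySameHostPort, before recordNewServer, so that a skewed server is rejected before it is registered.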

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
