hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HBASE-1314) master sees HRS znode expire and splits log while the HRS is still running and accepting edits
Date Tue, 07 Apr 2009 04:56:13 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HBASE-1314:
----------------------------------

    Description: 
ZK session expiration related problem. The HRS loses its ephemeral node while it is still up,
running, and accepting edits. The master sees the node go away and starts splitting the HRS logs
while edits are still being written. After this, all reconstruction logs have to be manually
removed from the region directories or the regions will never deploy (CRC errors). I think on
HDFS edits would be lost, not corrupted. (I am using an HBase root on the local file system.)

HRS ZK session expires, causing its znode to go away:

2009-04-07 03:50:39,953 INFO org.apache.hadoop.hbase.master.ServerManager: localhost.localdomain_1239068648333_60020 znode expired
2009-04-07 03:50:40,565 DEBUG org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessServerShutdown of localhost.localdomain_1239068648333_60020
2009-04-07 03:50:40,637 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server localhost.localdomain_1239068648333_60020: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1

But here we have the HRS still reporting in, triggering a LeaseStillHeldException:

2009-04-07 03:50:40,826 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000, call regionServerReport(address: 127.0.0.1:60020, startcode: 1239068648333, load: (requests=14, regions=7, usedHeap=582, maxHeap=888), [Lorg.apache.hadoop.hbase.HMsg;@6da21389, [Lorg.apache.hadoop.hbase.HRegionInfo;@2bb0bf9a) from 127.0.0.1:39238: error: org.apache.hadoop.hbase.Leases$LeaseStillHeldException
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:198)
        at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:601)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:909)

And log splitting starts anyway:

2009-04-07 03:50:41,139 INFO org.apache.hadoop.hbase.regionserver.HLog: Splitting 3 log(s) in file:/data/hbase/log_localhost.localdomain_1239068648333_60020
2009-04-07 03:50:41,139 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Splitting 1 of 3: file:/data/hbase/log_localhost.localdomain_1239068648333_60020/hlog.dat.1239075060711
[...]
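
The ordering in the log excerpts above (znode expired at 03:50:39,953, a regionServerReport from the same server at 03:50:40,826, splitting started at 03:50:41,139) can be replayed in a small standalone Java sketch. This is only an illustration of the race, not HBase code and not a proposed fix: the class `SplitGuardSketch`, its methods, and the 3000 ms quiet period are all hypothetical, and timestamps are given as millisecond offsets from the znode expiry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in HBase.
final class SplitGuardSketch {
    // Last time (ms) each server was seen reporting in to the master.
    private final Map<String, Long> lastReportMillis = new HashMap<>();
    // Assumed quiet period a server must stay silent before its logs are split.
    private final long quietPeriodMillis;

    SplitGuardSketch(long quietPeriodMillis) {
        this.quietPeriodMillis = quietPeriodMillis;
    }

    // Record that a server reported in at the given time.
    void onRegionServerReport(String serverName, long nowMillis) {
        lastReportMillis.put(serverName, nowMillis);
    }

    // True only if the server never reported, or has been silent long enough.
    boolean safeToSplit(String serverName, long nowMillis) {
        Long last = lastReportMillis.get(serverName);
        return last == null || nowMillis - last >= quietPeriodMillis;
    }

    public static void main(String[] args) {
        String server = "localhost.localdomain_1239068648333_60020";
        SplitGuardSketch guard = new SplitGuardSketch(3000L);
        // Offsets from 03:50:39,953 (znode expired): report at +873 ms, split at +1186 ms.
        guard.onRegionServerReport(server, 873L);
        System.out.println("safe to split at +1186 ms? " + guard.safeToSplit(server, 1186L));
    }
}
```

In this replay the server reported in only 313 ms before the split began, so a check of this kind would have flagged the split as premature. A quiet period alone does not prevent a live server from writing later, so it should be read as a way to see the race, not as a remedy.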


  was:
ZK session expiration related problem. The HRS loses its ephemeral node while it is still up,
running, and accepting edits. The master sees the node go away and starts splitting the HRS logs.
After this, all reconstruction logs have to be manually removed from the region directories or
the regions will never deploy (CRC errors). I think on HDFS edits would be lost, not corrupted.
(I am using an HBase root on the local file system.)

HRS ZK session expires, causing its znode to go away:

2009-04-07 03:50:39,953 INFO org.apache.hadoop.hbase.master.ServerManager: localhost.localdomain_1239068648333_60020 znode expired
2009-04-07 03:50:40,565 DEBUG org.apache.hadoop.hbase.master.HMaster: Processing todo: ProcessServerShutdown of localhost.localdomain_1239068648333_60020
2009-04-07 03:50:40,637 INFO org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of server localhost.localdomain_1239068648333_60020: logSplit: false, rootRescanned: false, numberOfMetaRegions: 1, onlineMetaRegions.size(): 1

But here we have the HRS still reporting in, triggering a LeaseStillHeldException:

2009-04-07 03:50:40,826 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60000, call regionServerReport(address: 127.0.0.1:60020, startcode: 1239068648333, load: (requests=14, regions=7, usedHeap=582, maxHeap=888), [Lorg.apache.hadoop.hbase.HMsg;@6da21389, [Lorg.apache.hadoop.hbase.HRegionInfo;@2bb0bf9a) from 127.0.0.1:39238: error: org.apache.hadoop.hbase.Leases$LeaseStillHeldException
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:198)
        at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:601)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:909)

And log splitting starts anyway:

2009-04-07 03:50:41,139 INFO org.apache.hadoop.hbase.regionserver.HLog: Splitting 3 log(s) in file:/data/hbase/log_localhost.localdomain_1239068648333_60020
2009-04-07 03:50:41,139 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Splitting 1 of 3: file:/data/hbase/log_localhost.localdomain_1239068648333_60020/hlog.dat.1239075060711
[...]



> master sees HRS znode expire and splits log while the HRS is still running and accepting edits
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1314
>                 URL: https://issues.apache.org/jira/browse/HBASE-1314
>             Project: Hadoop HBase
>          Issue Type: Bug
>    Affects Versions: 0.20.0
>            Reporter: Andrew Purtell
>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

