hbase-dev mailing list archives

From "Jim Kellerman (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HBASE-1052) Stopping a HRegionServer with unflushed cache causes data loss from org.apache.hadoop.hbase.DroppedSnapshotException
Date Tue, 09 Dec 2008 19:21:44 GMT

     [ https://issues.apache.org/jira/browse/HBASE-1052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jim Kellerman resolved HBASE-1052.
----------------------------------

       Resolution: Fixed
    Fix Version/s:     (was: 0.18.2)
                   0.19.0
         Assignee: Jim Kellerman

This cannot be fixed in the HBase 0.18.x branch because it depends on features from hadoop-0.19.
It has been fixed in trunk. See HBASE-728.

> Stopping a HRegionServer with unflushed cache causes data loss from org.apache.hadoop.hbase.DroppedSnapshotException
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-1052
>                 URL: https://issues.apache.org/jira/browse/HBASE-1052
>             Project: Hadoop HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.18.0, 0.18.1
>            Reporter: Cosmin Lehene
>            Assignee: Jim Kellerman
>            Priority: Critical
>             Fix For: 0.19.0
>
>
>  1.  Start an HBase cluster
>  2.  Create a table t1: create 't1', {NAME => 'f1'}
>  3.  Put a cell in the table: put 't1', 'r1', 'f1:', 'value'
>  4.  Scan it, see it's fine
>  5.  Stop the HRegionServer hosting the t1 region: hbase/bin/hbase-daemon.sh stop regionserver.
>  6.  Watch the region being reassigned from the original HRegionServer
>  7.  Scan the t1 table again. It's empty now.
> If the cache is flushed between step 4 and step 5 (e.g. by an HBase cluster restart), no data is lost. However, it means that if you stop a region server with a dirty cache you will lose some data.
> HRegionServer log after issuing hbase-daemon.sh stop regionserver:
> 2008-12-09 06:37:46,873 INFO org.apache.hadoop.ipc.Server: IPC Server handler 7 on 60020: exiting
> 2008-12-09 06:37:46,873 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020: exiting
> 2008-12-09 06:37:46,873 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 60020: exiting
> 2008-12-09 06:37:46,873 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 60020: exiting
> 2008-12-09 06:37:46,874 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2008-12-09 06:37:46,874 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
> 2008-12-09 06:37:46,874 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=60030]
> 2008-12-09 06:37:46,886 INFO org.mortbay.http.SocketListener: Stopped SocketListener on 0.0.0.0:60030
> 2008-12-09 06:37:46,948 INFO org.mortbay.util.Container: Stopped HttpContext[/static,/static]
> 2008-12-09 06:37:47,007 INFO org.mortbay.util.Container: Stopped HttpContext[/logs,/logs]
> 2008-12-09 06:37:47,007 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.servlet.WebApplicationHandler@60ded0f0
> 2008-12-09 06:37:47,094 INFO org.mortbay.util.Container: Stopped WebApplicationContext[/,/]
> 2008-12-09 06:37:47,094 INFO org.mortbay.util.Container: Stopped org.mortbay.jetty.Server@6490832e
> 2008-12-09 06:37:47,094 DEBUG org.apache.hadoop.hbase.regionserver.HRegionServer: closing region t1,,1228833363456
> 2008-12-09 06:37:47,094 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Compactions and cache flushes disabled for region t1,,1228833363456
> 2008-12-09 06:37:47,094 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Scanners disabled for region t1,,1228833363456
> 2008-12-09 06:37:47,094 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more active scanners for region t1,,1228833363456
> 2008-12-09 06:37:47,095 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region t1,,1228833363456
> 2008-12-09 06:37:47,095 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: No more row locks outstanding on region t1,,1228833363456
> 2008-12-09 06:37:47,095 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache flush for region t1,,1228833363456. Current region memcache size 18.0
> 2008-12-09 06:37:47,095 INFO org.apache.hadoop.hbase.regionserver.Flusher: regionserver/0:0:0:0:0:0:0:0:60020.cacheFlusher exiting
> 2008-12-09 06:37:47,096 INFO org.apache.hadoop.hbase.regionserver.LogRoller: LogRoller exiting.
> 2008-12-09 06:37:47,096 INFO org.apache.hadoop.hbase.regionserver.CompactSplitThread: regionserver/0:0:0:0:0:0:0:0:60020.compactor exiting
> 2008-12-09 06:37:47,099 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: error closing region t1,,1228833363456
> org.apache.hadoop.hbase.DroppedSnapshotException: region: t1,,1228833363456
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1071)
>     at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:619)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.closeAllRegions(HRegionServer.java:951)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:459)
>     at java.lang.Thread.run(Thread.java:619)
> Caused by: java.io.IOException: Filesystem closed
>     at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:196)
>     at org.apache.hadoop.dfs.DFSClient.getFileInfo(DFSClient.java:564)
>     at org.apache.hadoop.dfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:390)
>     at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:667)
>     at org.apache.hadoop.hbase.regionserver.HStoreFile.<init>(HStoreFile.java:152)
>     at org.apache.hadoop.hbase.regionserver.HStore.internalFlushCache(HStore.java:599)
>     at org.apache.hadoop.hbase.regionserver.HStore.flushCache(HStore.java:577)
>     at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:1058)
>     ... 4 more
> 2008-12-09 06:37:47,100 DEBUG org.apache.hadoop.hbase.regionserver.HLog: closing log writer in hdfs://h1:54310/hbase/log_10.131.237.51_1228833326838_60020
> 2008-12-09 06:37:47,101 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: Close and delete failed
> java.io.IOException: Filesystem closed
>     at org.apache.hadoop.dfs.DFSClient.checkOpen(DFSClient.java:196)
>     at org.apache.hadoop.dfs.DFSClient.access$600(DFSClient.java:59)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:2689)
>     at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.close(DFSClient.java:2655)
>     at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:59)
>     at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:79)
>     at org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:962)
>     at org.apache.hadoop.hbase.regionserver.HLog.close(HLog.java:349)
>     at org.apache.hadoop.hbase.regionserver.HLog.closeAndDelete(HLog.java:333)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:461)
>     at java.lang.Thread.run(Thread.java:619)
> 2008-12-09 06:37:47,102 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: telling master that region server is shutting down at: 10.131.237.51:60020
> 2008-12-09 06:37:47,104 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: stopping server at: 10.131.237.51:60020
> 2008-12-09 06:37:47,882 INFO org.apache.hadoop.hbase.Leases: regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closing leases
> 2008-12-09 06:37:47,882 INFO org.apache.hadoop.hbase.Leases: regionserver/0:0:0:0:0:0:0:0:60020.leaseChecker closed leases
> 2008-12-09 06:37:54,919 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
> 2008-12-09 06:37:54,920 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: regionserver/0:0:0:0:0:0:0:0:60020 exiting
> 2008-12-09 06:37:54,920 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Shutdown thread complete
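
In the log above, the memcache flush started by HRegion.close fails with "Filesystem closed", and the unflushed edits are dropped. The following is a minimal, self-contained Java sketch of that ordering problem only. It does not use the HBase or Hadoop APIs: FakeFileSystem, FakeRegion and stopServer are hypothetical names invented for illustration, and the sketch makes no claim about what closed the file system in the reported case.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class FlushOrderSketch {

    /** Hypothetical stand-in for a DFS client; not the Hadoop API. */
    static class FakeFileSystem {
        private boolean open = true;
        private final List<String> stored = new ArrayList<>();

        void close() {
            open = false;
        }

        /** Persists cells, or fails the same way the trace does once closed. */
        void write(List<String> cells) throws IOException {
            if (!open) {
                throw new IOException("Filesystem closed");
            }
            stored.addAll(cells);
        }

        List<String> read() {
            return new ArrayList<>(stored);
        }
    }

    /** Hypothetical stand-in for a region holding unflushed edits in memory. */
    static class FakeRegion {
        private final FakeFileSystem fs;
        private final List<String> memcache = new ArrayList<>();

        FakeRegion(FakeFileSystem fs) {
            this.fs = fs;
        }

        void put(String cell) {
            memcache.add(cell);
        }

        /** Snapshots the in-memory edits and flushes them; a failed flush drops the snapshot. */
        void close() throws IOException {
            List<String> snapshot = new ArrayList<>(memcache);
            memcache.clear();
            fs.write(snapshot);
        }
    }

    /** Simulates a server stop and returns what a later reader would find on the file system. */
    static List<String> stopServer(boolean closeFsFirst) {
        FakeFileSystem fs = new FakeFileSystem();
        FakeRegion region = new FakeRegion(fs);
        region.put("r1/f1:=value");
        try {
            if (closeFsFirst) {
                fs.close();      // file system goes away before the flush
                region.close();  // flush now throws
            } else {
                region.close();  // flush succeeds
                fs.close();
            }
        } catch (IOException e) {
            System.out.println("error closing region: " + e.getMessage());
        }
        return fs.read();
    }

    public static void main(String[] args) {
        System.out.println("fs closed first: " + stopServer(true));
        System.out.println("flush first:     " + stopServer(false));
    }
}
```

In this model, the cell survives only when the region is closed before the file system, which matches step 7 of the report, where the scan comes back empty after the failed flush.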

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

