hbase-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Lucas Nazário dos Santos <nazario.lu...@gmail.com>
Subject Re: Data loss
Date Sun, 23 Aug 2009 14:44:22 GMT
Chen, I've tried a couple of times and flush is not working for me in this
particular case (small table, kill -9 right after the table is created).

Andrew, the more frequenty reason for the crash is ZooKeeper (I think). This
is what is happening to me and I couldn't yet overcome the problem (
http://issues.apache.org/jira/browse/HBASE-1645). As written in the issue,
when it happen I have to kill the process.

But anyway, a very simple test you could do is to create a table, insert
three to five registers, flush it and kill -9 HBase. After restarting HBase
the table is gone (at least it's gone for me).

I'm running HBase 0.20.0 RC2 over two machines using Ubuntu.

Hadoop logs seem to be fine. But HBase files are full of errors. Let me post
some of them bellow.

Thanks,
Lucas



REGIONSERVER
2009-08-23 00:11:15,022 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll
/hbase/.logs/server2,60020,1250975459853/hlog.dat.1250993474713, entries=5,
calc$
2009-08-23 02:11:15,564 INFO org.apache.hadoop.hbase.regionserver.HLog: Roll
/hbase/.logs/server2,60020,1250975459853/hlog.dat.1250997075020, entries=16,
cal$
2009-08-23 02:43:56,815 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20006 to sun.nio.ch.SelectionKeyImpl@eaa827
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:43:57,068 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
559225ms, ten times longer than scheduled: 3000
2009-08-23 02:43:56,911 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Scanner
1622072927860071680 lease expired
2009-08-23 02:43:56,820 WARN org.apache.hadoop.hbase.util.Sleeper: We slept
567081ms, ten times longer than scheduled: 10000
2009-08-23 02:43:56,815 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20001 to sun.nio.ch.SelectionKeyImpl@1c5c3fe
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:43:57,857 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
state: Disconnected, type: None, path: null
2009-08-23 02:43:57,888 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
master for 561806 milliseconds - retrying
2009-08-23 02:43:59,455 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:43:59,455 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:43:59,511 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:44468 remote$
2009-08-23 02:43:59,511 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:43:59,566 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:44469 remote$
2009-08-23 02:43:59,566 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:43:59,614 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20006 to sun.nio.ch.SelectionKeyImpl@15d267a
java.io.IOException: Session Expired
        at
org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
        at
org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-08-23 02:43:59,621 INFO org.apache.zookeeper.ZooKeeper: Closing
session: 0x12343f29cb20006
2009-08-23 02:44:00,014 INFO org.apache.zookeeper.ClientCnxn: Closing
ClientCnxn for session: 0x12343f29cb20006
2009-08-23 02:44:00,355 INFO org.apache.zookeeper.ClientCnxn: Disconnecting
ClientCnxn for session: 0x12343f29cb20006
2009-08-23 02:44:00,558 INFO org.apache.zookeeper.ZooKeeper: Session:
0x12343f29cb20006 closed
2009-08-23 02:44:00,935 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2009-08-23 02:44:01,341 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20001 to sun.nio.ch.SelectionKeyImpl@894793
java.io.IOException: Session Expired
        at
org.apache.zookeeper.ClientCnxn$SendThread.readConnectResult(ClientCnxn.java:548)
        at
org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:661)
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:897)
2009-08-23 02:44:01,341 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event,
state: Expired, type: None, path: null
2009-08-23 02:44:01,341 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer: ZooKeeper session
expired
2009-08-23 02:44:04,821 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1813 on 60020, call get([B@17d26f0, row=nazario.lucas@gmail.com,
maxVersio$
java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1764)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:07,985 ERROR
org.apache.hadoop.hbase.regionserver.HRegionServer:
java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1914)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1899)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:08,146 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1814 on 60020, call openScanner([B@1af6aa, startRow=
http://www.prnewswire.$
java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.openScanner(HRegionServer.java:1865)
        at sun.reflect.GeneratedMethodAccessor12.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:08,239 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
(Retry: 0)
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
        at java.lang.Thread.run(Thread.java:619)
2009-08-23 02:44:09,368 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1816 on 60020, call getClosestRowBefore([B@e3cec, [B@c1bb6b,
[B@f79945) fr$
java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1748)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:09,382 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
master for 573861 milliseconds - retrying
2009-08-23 02:44:12,636 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1818 on 60020, call getClosestRowBefore([B@186e7b, [B@1fd8093,
[B@4748a) f$
java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.getClosestRowBefore(HRegionServer.java:1748)
        at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:12,660 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1815 on 60020, call next(-595952201864210105) from 192.168.1.3:53998:
erro$
java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1914)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1899)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:14,644 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
(Retry: 1)
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
        at java.lang.Thread.run(Thread.java:619)
2009-08-23 02:44:14,645 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: unable to report to
master for 579124 milliseconds - retrying
2009-08-23 02:44:15,678 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
request=0.0, regions=5, stores=37, storefiles=39, storefile$
2009-08-23 02:44:15,678 WARN
org.apache.hadoop.hbase.regionserver.HRegionServer: Processing message
(Retry: 2)
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkThrowable(RemoteExceptionHandler.java:48)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.checkIOException(RemoteExceptionHandler.java:66)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:572)
        at java.lang.Thread.run(Thread.java:619)
2009-08-23 02:44:15,679 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: Stop was requested,
clearing the toDo despite of the exception
2009-08-23 02:44:15,679 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60020
2009-08-23 02:44:17,328 INFO
org.apache.hadoop.hbase.regionserver.LogFlusher:
regionserver/192.168.1.3:60020.logFlusher exiting
2009-08-23 02:44:17,998 INFO
org.apache.hadoop.hbase.regionserver.CompactSplitThread:
regionserver/192.168.1.3:60020.compactor exiting
2009-08-23 02:44:18,108 INFO
org.apache.hadoop.hbase.regionserver.HRegionServer: worker thread exiting
2009-08-23 02:44:18,335 INFO org.apache.hadoop.hbase.Leases:
regionserver/192.168.1.3:60020.leaseChecker closing leases
2009-08-23 02:44:18,335 INFO org.apache.hadoop.hbase.Leases:
regionserver/192.168.1.3:60020.leaseChecker closed leases
2009-08-23 02:44:18,560 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1296 on 60020: exiting
2009-08-23 02:44:18,560 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1314 on 60020: exiting
2009-08-23 02:44:18,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1217 on 60020: exiting
2009-08-23 02:44:18,691 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 2 on 60020: exiting
2009-08-23 02:44:18,831 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60020: exiting
2009-08-23 02:44:18,831 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60020: exiting
2009-08-23 02:44:18,831 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60020: exiting
2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 13 on 60020: exiting
2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 10 on 60020: exiting
2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 14 on 60020: exiting
2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 16 on 60020: exiting
2009-08-23 02:44:18,832 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 18 on 60020: exiting
2009-08-23 02:44:37,208 INFO org.apache.hadoop.hdfs.DFSClient:
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredExce$
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1317)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1308)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1236)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at $Proxy1.addBlock(Unknown Source)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2875)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2757)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2048)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2234)

2009-08-23 02:44:37,301 INFO org.apache.hadoop.hdfs.DFSClient: Waiting for
replication for 10 seconds
2009-08-23 02:44:37,302 WARN org.apache.hadoop.hdfs.DFSClient:
NotReplicatedYetException sleeping
/hbase/.logs/server2,60020,1250975459853/hlog.dat.125100427$
2009-08-23 02:44:37,742 INFO org.apache.hadoop.hdfs.DFSClient:
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.hdfs.server.namenode.LeaseExpiredExce$
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1317)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:1308)
        at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1236)
        at
org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:422)
        at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:959)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:955)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:953)

        at org.apache.hadoop.ipc.Client.call(Client.java:739)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
        at $Proxy1.addBlock(Unknown Source)

MASTER
2009-08-23 02:35:32,736 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@108c1b7
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:35:34,387 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@b92956
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:35:34,410 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:35:34,410 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:50441 remote$
2009-08-23 02:35:34,467 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:35:38,463 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:35:38,463 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:50444 remote$
2009-08-23 02:35:38,463 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:35:57,987 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.rootScanner scanning meta region {server: 192.168.1.2:60020,
regionnam$
2009-08-23 02:35:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 2
region servers, 0 dead, average load 5.0
2009-08-23 02:36:04,528 INFO org.apache.hadoop.hbase.master.ServerManager:
server2,60020,1250975459853 znode expired
2009-08-23 02:36:04,533 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.rootScanner scan of 1 row(s) of meta region {server:
192.168.1.2:60020$
2009-08-23 02:36:06,779 INFO org.apache.hadoop.hbase.master.RegionManager:
META region removed from onlineMetaRegions
2009-08-23 02:36:06,779 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of
server server2,60020,1250975459853: logSplit: false, r$
2009-08-23 02:36:36,769 INFO org.apache.hadoop.hbase.regionserver.HLog:
Splitting 8 hlog(s) in
hdfs://server2:9000/hbase/.logs/server2,60020,1250975459853
2009-08-23 02:36:57,775 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@508204
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:36:57,987 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.rootScanner scanning meta region {server: 192.168.1.2:60020,
regionnam$
2009-08-23 02:36:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 1
region servers, 1 dead, average load 5.0[server2,60020,1250975459853]
2009-08-23 02:36:59,487 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:36:59,488 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:42560 remote$
2009-08-23 02:36:59,488 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:36:59,657 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@f16ddc
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:37:01,434 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:37:01,434 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:42563 remote$
2009-08-23 02:37:01,434 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:37:02,727 INFO org.apache.hadoop.hbase.master.BaseScanner:
RegionManager.rootScanner scan of 1 row(s) of meta region {server:
192.168.1.2:60020$
2009-08-23 02:37:32,729 INFO org.apache.hadoop.hbase.master.ServerManager:
server,60020,1250975507256 znode expired
2009-08-23 02:37:34,118 INFO org.apache.hadoop.hbase.master.RegionManager:
-ROOT- region unset (but not set to be reassigned)
2009-08-23 02:37:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:38:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:39:15,337 WARN org.apache.hadoop.hbase.regionserver.HLog:
Empty hlog, continuing: org.apache.hadoop.fs.FileStatus@9d684e10 count=0
java.io.EOFException
        at java.io.DataInputStream.readFully(DataInputStream.java:180)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at
org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1450)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1428)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
        at
org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
        at org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:876)
        at org.apache.hadoop.hbase.regionserver.HLog.splitLog(HLog.java:802)
        at
org.apache.hadoop.hbase.master.ProcessServerShutdown.process(ProcessServerShutdown.java:274)
        at
org.apache.hadoop.hbase.master.HMaster.processToDoQueue(HMaster.java:492)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:426)
2009-08-23 02:39:17,375 INFO org.apache.hadoop.hbase.regionserver.HLog: hlog
file splitting completed in 189186 millis for
hdfs://server2:9000/hbase/.logs/se$
2009-08-23 02:39:17,376 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete,
meta reassignment and scanning:
2009-08-23 02:39:17,376 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: ProcessServerShutdown
setting to unassigned: {server: 192.168.1.3:60020, r$
2009-08-23 02:39:17,376 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: process shutdown of
server server,60020,1250975507256: logSplit: false, ro$
2009-08-23 02:39:17,382 INFO org.apache.hadoop.hbase.regionserver.HLog:
Splitting 9 hlog(s) in
hdfs://server2:9000/hbase/.logs/server,60020,1250975507256
2009-08-23 02:39:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:40:20,775 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@110c0df
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:40:21,585 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@e696f9
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:40:22,432 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:40:22,433 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:43272 remote$
2009-08-23 02:40:22,433 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:40:22,854 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:40:22,854 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:43273 remote$
2009-08-23 02:40:22,854 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:40:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:41:11,930 INFO org.apache.hadoop.hbase.regionserver.HLog: hlog
file splitting completed in 114553 millis for
hdfs://server2:9000/hbase/.logs/se$
2009-08-23 02:41:11,930 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: Log split complete,
meta reassignment and scanning:
2009-08-23 02:41:12,135 INFO
org.apache.hadoop.hbase.master.RegionServerOperation: ProcessServerShutdown
reassigning ROOT region
2009-08-23 02:41:12,135 INFO org.apache.hadoop.hbase.master.RegionManager:
-ROOT- region unset (but not set to be reassigned)
2009-08-23 02:41:12,135 INFO org.apache.hadoop.hbase.master.RegionManager:
ROOT inserted into regionsInTransition
2009-08-23 02:41:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:43:29,197 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20000 to sun.nio.ch.SelectionKeyImpl@d89c2
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:43:29,755 WARN org.apache.zookeeper.ClientCnxn: Exception
closing session 0x12343f29cb20003 to sun.nio.ch.SelectionKeyImpl@cf209b
java.io.IOException: TIMED OUT
        at
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:858)
2009-08-23 02:43:31,021 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:43:31,022 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:44084 remote$
2009-08-23 02:43:31,081 INFO org.apache.zookeeper.ClientCnxn: Attempting
connection to server server2/192.168.1.3:2181
2009-08-23 02:43:31,081 INFO org.apache.zookeeper.ClientCnxn: Priming
connection to java.nio.channels.SocketChannel[connected local=/
192.168.1.3:44085 remote$
2009-08-23 02:43:31,091 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:43:31,091 INFO org.apache.zookeeper.ClientCnxn: Server
connection successful
2009-08-23 02:43:58,228 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:44:06,668 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 67 on 60000, call regionServerReport(address: 192.168.1.3:60020,
startcode$
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:250)
        at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:722)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:09,991 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 68 on 60000, call regionServerReport(address: 192.168.1.3:60020,
startcode$
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:250)
        at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:722)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:15,594 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 69 on 60000, call regionServerReport(address: 192.168.1.3:60020,
startcode$
org.apache.hadoop.hbase.Leases$LeaseStillHeldException
        at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:250)
        at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:722)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
2009-08-23 02:44:24,441 ERROR org.apache.hadoop.hbase.master.BaseScanner:
Closing scanner
java.io.IOException: Call to /192.168.1.3:60020 failed on local exception:
java.io.EOFException
        at
org.apache.hadoop.hbase.ipc.HBaseClient.wrapException(HBaseClient.java:757)
        at
org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:727)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:328)
        at $Proxy1.close(Unknown Source)
        at
org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:203)
        at
org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
        at
org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
        at
org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:135)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:375)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.receiveResponse(HBaseClient.java:505)
        at
org.apache.hadoop.hbase.ipc.HBaseClient$Connection.run(HBaseClient.java:449)
2009-08-23 02:44:24,474 WARN org.apache.hadoop.hbase.master.BaseScanner:
Scan one META region: {server: 192.168.1.3:60020, regionname: .META.,,1,
startKey: <$
java.io.IOException: java.io.IOException: Server not running, aborting
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2299)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1914)
        at
org.apache.hadoop.hbase.regionserver.HRegionServer.next(HRegionServer.java:1899)
        at sun.reflect.GeneratedMethodAccessor8.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:650)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)

        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native
Method)
        at
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
        at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
        at
org.apache.hadoop.hbase.RemoteExceptionHandler.decodeRemoteException(RemoteExceptionHandler.java:94)
        at
org.apache.hadoop.hbase.master.BaseScanner.scanRegion(BaseScanner.java:191)
        at
org.apache.hadoop.hbase.master.MetaScanner.scanOneMetaRegion(MetaScanner.java:73)
        at
org.apache.hadoop.hbase.master.MetaScanner.maintenanceScan(MetaScanner.java:129)
        at
org.apache.hadoop.hbase.master.BaseScanner.chore(BaseScanner.java:135)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:68)
2009-08-23 02:44:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
1 .META. region(s) scanned
2009-08-23 02:44:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:44:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:45:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:45:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:46:24,475 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:46:58,104 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:47:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:47:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:48:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:48:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:49:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:49:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:50:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:50:58,104 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:51:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:51:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:52:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:52:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:53:24,474 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:53:58,103 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:54:24,478 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:54:58,104 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:55:24,478 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:55:58,104 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:56:24,478 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned
2009-08-23 02:56:58,104 INFO org.apache.hadoop.hbase.master.ServerManager: 0
region servers, 2 dead, average load NaN[server,60020,1250975507256,
server2,600$
2009-08-23 02:57:24,479 INFO org.apache.hadoop.hbase.master.BaseScanner: All
0 .META. region(s) scanned



On Sun, Aug 23, 2009 at 3:07 AM, Andrew Purtell <apurtell@apache.org> wrote:

> See http://issues.apache.org/jira/browse/HDFS-200. I think support for
> what HBase needs from HDFS is almost there. Alternatively you could consider
> a different type of underlying filesystem -- Lustre, Gluster, etc.
>
> However, you say that HBase crashes for a very small table. I wonder why
> that is. For example I have a table with more than 1TB of data and hundreds
> of regions over 4 servers only and HBase is stable. What hardware resources
> are you running HBase on (CPU, RAM, disk, number of servers, etc.)? What
> other processes are you running on those servers? What are your observations
> around the time of the crash? For example, have you noticed exceptions in
> the HBase or DFS logs? I think getting to the bottom of this would help you
> more than explicit flushing as a workaround.
>
>   - Andy
>
> 2009/8/23 Lucas Nazário dos Santos <nazario.lucas@gmail.com>
>
> > Hi,
> >
> > I have a very small table under HBase that I use to store control data of
> > my program. If HBase crashes and I have to kill it, the table goes away.
> > However, I can still see the table inside HDFS.
> >
> > What is more interesting is that if I stop HBase properly, the table
> seems
> > to be persisted and I don't lose it anymore as consequence of a crash.
> >
> > Because the table is very small, I think it's not being flushed to HDFS
> > (the table itself or/and the meta info). I try to flush everything with
> > HTable#flushCommits and HBaseAdmin#flush with no success. Does anybody
> has
> > already gone through it? How can a flush EVERYTHING to HDFS so I won't
> lose
> > data as consequence of a kill -9? Any special configuration inside
> > hbase-site.xml?
> >
> > I'm using HBase 0.20.0 RC2 together with Hadoop 0.20.0.
> >
> > Thanks,
> > Lucas
> >
>
>
>
> --
> Best Regards,
> Chen Xinli
>
>
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message