hbase-user mailing list archives

From 茅旭峰 <m9s...@gmail.com>
Subject Re: HMaster startup is very slow, and always run into out-of-memory issue
Date Thu, 10 Mar 2011 03:23:30 GMT
Thanks, Stack, for your reply!

Yes, our application uses big cells, ranging from 4 MB to 15 MB per entry.
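
For context, here is a minimal sketch of what one of these writes looks like through the
Java client API (our loads actually go through the REST interface; the table name
'richard' is taken from the log paths below, and the column family, qualifier and row
key are placeholders):

=====
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BigCellPut {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Table name taken from the log paths below; family/qualifier/row are placeholders.
    HTable table = new HTable(conf, "richard");

    byte[] value = new byte[4 * 1024 * 1024];   // one 4 MB cell; some entries reach ~15 MB
    Put put = new Put(Bytes.toBytes("row-0001"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("data"), value);
    table.put(put);   // the whole multi-MB value goes through the memstore and the WAL
    table.close();
  }
}
=====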

Regarding the shutdown of the RS because of ZK session loss, the master logs are below:

========
2011-03-09 10:14:37,511 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
Scanned 6943 catalog row(s) and gc'd 1 unreferenced parent region(s)
2011-03-09 10:14:39,010 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
This client just lost it's session with ZooKeeper, trying to reconnect.
2011-03-09 10:14:39,010 INFO org.apache.zookeeper.ClientCnxn: Unable to
reconnect to ZooKeeper service, session 0x12e70bfa76b00c0 has expired,
closing socket connection
2011-03-09 10:14:39,010 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Trying to reconnect to zookeeper
2011-03-09 10:14:39,011 INFO org.apache.zookeeper.ZooKeeper: Initiating
client connection, connectString=cloud140:2181 sessionTimeout=180000
watcher=hconnection
2011-03-09 10:14:39,012 INFO org.apache.zookeeper.ClientCnxn: Opening socket
connection to server cloud140/10.241.67.33:2181
2011-03-09 10:14:39,014 INFO org.apache.zookeeper.ClientCnxn: Socket
connection established to cloud140/10.241.67.33:2181, initiating session
2011-03-09 10:14:39,024 INFO org.apache.zookeeper.ClientCnxn: Unable to
reconnect to ZooKeeper service, session 0x12e70bfa76b00bf has expired,
closing socket connection
2011-03-09 10:14:39,024 FATAL org.apache.hadoop.hbase.master.HMaster:
master:60000-0x12e70bfa76b00bf master:60000-0x12e70bfa76b00bf received
expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired
        at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:328)
        at
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:246)
        at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
        at
org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-03-09 10:14:39,025 INFO org.apache.hadoop.hbase.master.HMaster:
Aborting
2011-03-09 10:14:39,025 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2011-03-09 10:14:39,058 INFO org.apache.zookeeper.ClientCnxn: Session
establishment complete on server cloud140/10.241.67.33:2181, sessionid =
0x12e70bfa76b00ce, negotiated timeout = 40000
2011-03-09 10:14:39,074 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Reconnected successfully. This disconnect could have been caused by a
network partition or a long-running GC pause, either way it's recommended
that you verify your environment.
2011-03-09 10:14:39,075 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2011-03-09 10:14:39,125 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
master:60000-0x12e70bfa76b00bf Unable to get data of znode
/hbase/unassigned/445e3880a1281b2e549ac7a36c83ba4f
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for
/hbase/unassigned/445e3880a1281b2e549ac7a36c83ba4f
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:586)
        at
org.apache.hadoop.hbase.zookeeper.ZKAssign.getDataNoWatch(ZKAssign.java:765)
        at
org.apache.hadoop.hbase.master.AssignmentManager.handleSplitReport(AssignmentManager.java:1754)
        at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:281)
        at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:639)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
2011-03-09 10:14:39,125 ERROR
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
master:60000-0x12e70bfa76b00bf Received unexpected KeeperException,
re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for
/hbase/unassigned/445e3880a1281b2e549ac7a36c83ba4f
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:586)
        at
org.apache.hadoop.hbase.zookeeper.ZKAssign.getDataNoWatch(ZKAssign.java:765)
        at
org.apache.hadoop.hbase.master.AssignmentManager.handleSplitReport(AssignmentManager.java:1754)
        at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:281)
        at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:639)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
2011-03-09 10:14:39,125 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Exception while validating
RIT during split report
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for
/hbase/unassigned/445e3880a1281b2e549ac7a36c83ba4f
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataNoWatch(ZKUtil.java:586)
        at
org.apache.hadoop.hbase.zookeeper.ZKAssign.getDataNoWatch(ZKAssign.java:765)
        at
org.apache.hadoop.hbase.master.AssignmentManager.handleSplitReport(AssignmentManager.java:1754)
        at
org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:281)
        at
org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:639)
        at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
        at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at
org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
        at
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
2011-03-09 10:14:39,516 DEBUG org.apache.hadoop.hbase.master.HMaster:
Stopping service threads
2011-03-09 10:14:39,516 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60000
2011-03-09 10:14:39,516 INFO org.apache.hadoop.hbase.master.HMaster$1:
cloud135:60000-BalancerChore exiting
2011-03-09 10:14:39,516 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60000: exiting
2011-03-09 10:14:39,516 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server listener on 60000
2011-03-09 10:14:39,516 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60000: exiting
2011-03-09 10:14:39,516 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 5 on 60000: exiting
2011-03-09 10:14:39,516 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 6 on 60000: exiting
2011-03-09 10:14:39,516 INFO org.apache.hadoop.hbase.master.CatalogJanitor:
cloud135:60000-CatalogJanitor exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 0 on 60000: exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1 on 60000: exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 60000: exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 2 on 60000: exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server Responder
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 4 on 60000: exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60000: exiting
2011-03-09 10:14:39,517 INFO org.apache.hadoop.hbase.master.HMaster:
Stopping infoServer
2011-03-09 10:14:39,517 INFO org.apache.hadoop.hbase.master.LogCleaner:
master-cloud135:60000.oldLogCleaner exiting
2011-03-09 10:14:39,518 INFO org.mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:60010
2011-03-09 10:14:39,519 WARN org.apache.hadoop.hbase.zookeeper.ZKUtil:
master:60000-0x12e70bfa76b00bf Unable to get data of znode /hbase/master
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
        at
org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:180)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:293)
2011-03-09 10:14:39,520 ERROR
org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher:
master:60000-0x12e70bfa76b00bf Received unexpected KeeperException,
re-throwing exception
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
        at
org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:180)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:293)
2011-03-09 10:14:39,520 ERROR
org.apache.hadoop.hbase.master.ActiveMasterManager:
master:60000-0x12e70bfa76b00bf Error deleting our own master address node
org.apache.zookeeper.KeeperException$SessionExpiredException:
KeeperErrorCode = Session expired for /hbase/master
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:118)
        at
org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:921)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAndWatch(ZKUtil.java:549)
        at
org.apache.hadoop.hbase.zookeeper.ZKUtil.getDataAsAddress(ZKUtil.java:620)
        at
org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:180)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:293)
2011-03-09 10:14:39,520 DEBUG
org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker
org.apache.hadoop.hbase.catalog.CatalogTracker@5e29c58e
2011-03-09 10:14:39,521 INFO
org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
cloud135:60000.timeoutMonitor exiting
2011-03-09 10:14:39,621 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Closed zookeeper sessionid=0x12e70bfa76b00ce
2011-03-09 10:14:39,623 INFO org.apache.zookeeper.ZooKeeper: Session:
0x12e70bfa76b00ce closed
2011-03-09 10:14:39,623 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2011-03-09 10:14:39,623 INFO org.apache.hadoop.hbase.master.HMaster: HMaster
main thread exiting
=======

I'm inclined to think the session expiration is not related to swapping or a long GC
pause, because the timestamps show everything above happening within a single second,
'2011-03-09 10:14:39'.
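
If it were a long GC pause, it should show up in the master's GC log. A minimal
hbase-env.sh sketch for turning GC logging on so we can verify this (the log path is
just an example, not what we actually use):

=====
# hbase-env.sh: enable GC logging on the master to rule a long pause in or out
# (the -Xloggc path is only an example)
export HBASE_OPTS="$HBASE_OPTS -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/hbase/gc-master.log"
=====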

----- You said
Interesting about the above is the amount of log files we're
splitting.  We have to split 219 files before we get back on line
again.  That's a lot.  And it'll take a bunch of time.  I wonder if it's
because your cell sizes are sort of larger than usual, causing the
run-up in the number of hlogs?  On the regionserver, you should see
messages about it trying to clear the WAL logs.  Do you?
-----

What is the recommended or typical cell size that HBase is meant to support?
I'm not clear about the 'messages about it trying to clear...' point;
could you explain a bit more?

----- You said
This is also interesting.  Again, I'd guess it's your big cells that are
bringing on this new condition.  A bunch of work was done before the
0.90 release to defend against log splitting bringing on OOMEs, but
your big cells seem to bring it on.  Can you post more of your master
log?
-----

The master log looks like this:
=====
2011-03-09 17:14:00,943 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path
hdfs://cloud135:9000/hbase/richard/fee41daaa54b8d629a3ed20a4c25109c/recovered.edits/0000000000003130149
(wrote 6 edits in 95ms)
2011-03-09 17:14:00,997 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path
hdfs://cloud135:9000/hbase/richard/ff3a4482f21a69b8b5773cfeedb909f1/recovered.edits/0000000000003131006
(wrote 23 edits in 1812ms)
2011-03-09 17:14:01,009 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path
hdfs://cloud135:9000/hbase/richard/ffcf64cfc0fc36ab8796c6da1b8ab668/recovered.edits/0000000000003132448
(wrote 9 edits in 300ms)
2011-03-09 17:14:01,021 INFO
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Closed path
hdfs://cloud135:9000/hbase/richard/ffdbb9aed85f5ce3a85bd05f9c1a4336/recovered.edits/0000000000003129287
(wrote 9 edits in 242ms)
2011-03-09 17:14:01,026 FATAL org.apache.hadoop.hbase.master.HMaster:
Unhandled exception. Starting shutdown.
java.lang.OutOfMemoryError: Java heap space
        at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1970)
        at org.apache.hadoop.hbase.KeyValue.readFields(KeyValue.java:1977)
        at
org.apache.hadoop.hbase.regionserver.wal.WALEdit.readFields(WALEdit.java:118)
        at
org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:1766)
        at
org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1894)
        at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:198)
        at
org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:172)
        at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.parseHLog(HLogSplitter.java:429)
        at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:262)
        at
org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLog(HLogSplitter.java:188)
        at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:196)
        at
org.apache.hadoop.hbase.master.MasterFileSystem.splitLogAfterStartup(MasterFileSystem.java:180)
        at
org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:379)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:278)
2011-03-09 17:14:01,037 INFO org.apache.hadoop.hbase.master.HMaster:
Aborting
2011-03-09 17:14:01,037 DEBUG org.apache.hadoop.hbase.master.HMaster:
Stopping service threads
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
server on 60000
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 1 on 60000: exiting
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 0 on 60000: exiting
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 2 on 60000: exiting
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 9 on 60000: exiting
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 3 on 60000: exiting
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 5 on 60000: exiting
2011-03-09 17:14:01,037 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 8 on 60000: exiting
2011-03-09 17:14:01,038 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server listener on 60000
2011-03-09 17:14:01,038 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 6 on 60000: exiting
2011-03-09 17:14:01,038 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 7 on 60000: exiting
2011-03-09 17:14:01,038 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
handler 4 on 60000: exiting
2011-03-09 17:14:01,038 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
Server Responder
2011-03-09 17:14:01,038 INFO org.apache.hadoop.hbase.master.HMaster:
Stopping infoServer
2011-03-09 17:14:01,038 INFO org.apache.hadoop.hbase.master.LogCleaner:
master-cloud135:60000.oldLogCleaner exiting
2011-03-09 17:14:01,040 INFO org.mortbay.log: Stopped
SelectChannelConnector@0.0.0.0:60010
2011-03-09 17:14:01,174 DEBUG
org.apache.hadoop.hbase.catalog.CatalogTracker: Stopping catalog tracker
org.apache.hadoop.hbase.catalog.CatalogTracker@1f57ea4a
2011-03-09 17:14:01,174 INFO
org.apache.hadoop.hbase.master.AssignmentManager$TimeoutMonitor:
cloud135:60000.timeoutMonitor exiting
2011-03-09 17:14:01,175 INFO
org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation:
Closed zookeeper sessionid=0x22e9990fcfe0006
2011-03-09 17:14:01,183 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2011-03-09 17:14:01,184 INFO org.apache.zookeeper.ZooKeeper: Session:
0x22e9990fcfe0006 closed
2011-03-09 17:14:01,254 INFO org.apache.zookeeper.ZooKeeper: Session:
0x22e9990fcfe0005 closed
2011-03-09 17:14:01,254 INFO org.apache.zookeeper.ClientCnxn: EventThread
shut down
2011-03-09 17:14:01,254 INFO org.apache.hadoop.hbase.master.HMaster: HMaster
main thread exiting
Wed Mar  9 17:16:33 CST 2011 Stopping hbase (via master)
=====

Hopefully this helps to address the issue. Thanks again for your reply.


On Thu, Mar 10, 2011 at 2:57 AM, Stack <stack@duboce.net> wrote:

> On Wed, Mar 9, 2011 at 2:05 AM, 茅旭峰 <m9suns@gmail.com> wrote:
> > It works 'well'. The problem is that after some sort of stress test, say
> > launching 20 threads putting data with the RESTful API, each block 4 MB
> > in size, the HMaster always shuts down due to ZooKeeper session timeout.
>
> What is 4MB? Each of your entries?
>
> Regarding the shutdown of the RS because of ZK session loss, can you
> correlate the session expiration to anything?  Swapping?  Or a long GC
> pause?
>
>
> > After the timeout, I tried to restart the hmaster, then I
> > saw lots of
> >
> > ====
> > 2011-03-09 17:56:16,032 DEBUG
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Pushed=107 entries
> > from
> >
> hdfs://cloud135:9000/hbase/.logs/cloud136,60020,1299572235779/cloud136%3A60020.1299636326073
> > 2011-03-09 17:56:16,032 DEBUG
> > org.apache.hadoop.hbase.regionserver.wal.HLogSplitter: Splitting hlog 134
> of
> > 219:
>
>
> This is the master splitting WAL files.  It must complete before
> regions can come back on line again.
>
> Interesting about the above is the amount of log files we're
> splitting.  We have to split 219 files before we get back on line
> again.  That's a lot.  And it'll take a bunch of time.  I wonder if it's
> because your cell sizes are sort of larger than usual, causing the
> run-up in the number of hlogs?  On the regionserver, you should see
> messages about it trying to clear the WAL logs.  Do you?
>
>
> > in the log file. This recovery would take a pretty long time, and it would
> > eventually lead to a Java heap out-of-memory error, even though I set
> > HBASE_HEAPSIZE to 4000 in conf/hbase-site.xml.
> >
>
> This is also interesting.  Again, I'd guess it's your big cells that are
> bringing on this new condition.  A bunch of work was done before the
> 0.90 release to defend against log splitting bringing on OOMEs, but
> your big cells seem to bring it on.  Can you post more of your master
> log?
>
> St.Ack
>
>
> > My question is that the recovery takes such a long time, and in the
> > meanwhile I cannot do anything, e.g. run bin/hbase shell and list.
> > Is this normal, or some sort of configuration issue? If it's normal, is
> > there any guideline or document addressing this performance
> > issue? Thanks a lot for your reply.
> >
> > I'm using hbase-0.90.1-CDH3B4.
> >
> > Best regards,
> >
> > Mao Xu-Feng
> >
>
