hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Region Server lost response when doing BatchUpdate
Date Mon, 13 Apr 2009 12:40:57 GMT
I see that your region server had 5188 store files in 121 stores; I'm
99% sure that's the cause of your OOME. Luckily for you, we've been
working on this issue since last week. What you should do:

- Upgrade to HBase 0.19.1

- Apply the latest patch in
https://issues.apache.org/jira/browse/HBASE-1058 (the v3)

Then you should be good. As to what caused this huge number of store
files, I wouldn't be surprised if your data was uploaded sequentially,
which would mean that whatever the number of regions (hence the level
of distribution) in your table, only 1 region gets the load. This
implies that another workaround would be to insert with a more
randomized pattern, along the lines of the sketch below.
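
A minimal sketch of such a salted insert, assuming the 0.19 client API
(HTable and BatchUpdate); the table name, column family, and bucket
count are placeholders:

import java.io.IOException;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class SaltedLoader {
  public static void main(String[] args) throws IOException {
    // Prefix each sequential key with a small hash-derived salt so
    // consecutive writes spread over all regions instead of always
    // hitting the same one.
    HTable table = new HTable(new HBaseConfiguration(), "mytable");
    for (long i = 0; i < 100000; i++) {
      String key = String.format("%019d", i);  // original sequential key
      int salt = key.hashCode() & 0x0F;        // bucket 0..15
      BatchUpdate bu =
          new BatchUpdate(String.format("%02d-%s", salt, key));
      bu.put("content:value", ("value-" + i).getBytes());
      table.commit(bu);
    }
  }
}

Keep in mind that salting changes the sort order, so a scan over the
original key range has to fan out over all the salt buckets.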

Thx for trying either solution,

J-D

On Mon, Apr 13, 2009 at 8:28 AM, 11 Nov. <nov.eleventh@gmail.com> wrote:
> hi colleagues,
>    We have recently been inserting data into a 32-node hbase cluster
> using the mapreduce framework, but the operation always fails because
> of regionserver exceptions. We run 4 map tasks simultaneously on the
> same node, and use the BatchUpdate() function to insert the data.
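>
>    (Our actual job is more involved, but each map task inserts
> roughly along the lines of this simplified sketch; the table and
> column names are placeholders:)
>
> import java.io.IOException;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.io.BatchUpdate;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.NullWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapred.JobConf;
> import org.apache.hadoop.mapred.MapReduceBase;
> import org.apache.hadoop.mapred.Mapper;
> import org.apache.hadoop.mapred.OutputCollector;
> import org.apache.hadoop.mapred.Reporter;
>
> public class InsertMapper extends MapReduceBase
>     implements Mapper<LongWritable, Text, NullWritable, NullWritable> {
>   private HTable table;
>
>   public void configure(JobConf job) {
>     try {
>       // One HTable per task, reused across map() calls.
>       table = new HTable(new HBaseConfiguration(), "mytable");
>     } catch (IOException e) {
>       throw new RuntimeException(e);
>     }
>   }
>
>   public void map(LongWritable offset, Text line,
>       OutputCollector<NullWritable, NullWritable> out, Reporter reporter)
>       throws IOException {
>     // One row per input record; here the input line is the row key.
>     BatchUpdate bu = new BatchUpdate(line.toString());
>     bu.put("content:value", line.toString().getBytes());
>     table.commit(bu);
>   }
> }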
>    We have been suffering from this problem since last month; it only
> shows up on relatively large clusters at a high concurrent insert
> rate. We are using hadoop-0.19.2 from current svn (the head revision
> as of last week), and hbase 0.19.0.
>
>    Here is the configuration file, hadoop-site.xml:
>
> <configuration>
> <property>
>  <name>fs.default.name</name>
>  <value>hdfs://192.168.33.204:11004/</value>
> </property>
>
> <property>
>  <name>mapred.job.tracker</name>
>  <value>192.168.33.204:11005</value>
> </property>
>
> <property>
>  <name>dfs.secondary.http.address</name>
>  <value>0.0.0.0:51100</value>
>  <description>
>    The secondary namenode http server address and port.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
> <property>
>  <name>dfs.datanode.address</name>
>  <value>0.0.0.0:51110</value>
>  <description>
>    The address where the datanode server will listen to.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
> <property>
>  <name>dfs.datanode.http.address</name>
>  <value>0.0.0.0:51175</value>
>  <description>
>    The datanode http server address and port.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
> <property>
>  <name>dfs.datanode.ipc.address</name>
>  <value>0.0.0.0:11010</value>
>  <description>
>    The datanode ipc server address and port.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
> <property>
>  <name>dfs.datanode.handler.count</name>
>  <value>30</value>
>  <description>The number of server threads for the datanode.</description>
> </property>
>
> <property>
>  <name>dfs.namenode.handler.count</name>
>  <value>30</value>
>  <description>The number of server threads for the namenode.</description>
> </property>
>
> <property>
>  <name>mapred.job.tracker.handler.count</name>
>  <value>30</value>
> </property>
>
> <property>
>  <name>mapred.reduce.parallel.copies</name>
>  <value>30</value>
> </property>
>
> <property>
>  <name>dfs.http.address</name>
>  <value>0.0.0.0:51170</value>
>  <description>
>    The address and the base port where the dfs namenode web ui will listen
> on.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
> <property>
>  <name>dfs.datanode.max.xcievers</name>
>  <value>8192</value>
>  <description>
>    Upper bound on the number of xceiver threads the datanode uses to
>    serve blocks; raised well above the default, as HBase needs.
>  </description>
> </property>
>
> <property>
>  <name>dfs.datanode.socket.write.timeout</name>
>  <value>0</value>
>  <description>
>    Write timeout for datanode sockets, in milliseconds; 0 disables
>    the timeout.
>  </description>
> </property>
>
>
> <property>
>  <name>dfs.datanode.https.address</name>
>  <value>0.0.0.0:50477</value>
> </property>
>
> <property>
>  <name>dfs.https.address</name>
>  <value>0.0.0.0:50472</value>
> </property>
>
> <property>
>  <name>mapred.job.tracker.http.address</name>
>  <value>0.0.0.0:51130</value>
>  <description>
>    The job tracker http server address and port the server will listen on.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
> <property>
>  <name>mapred.task.tracker.http.address</name>
>  <value>0.0.0.0:51160</value>
>  <description>
>    The task tracker http server address and port.
>    If the port is 0 then the server will start on a free port.
>  </description>
> </property>
>
>
> <property>
>  <name>mapred.map.tasks</name>
>  <value>3</value>
> </property>
>
> <property>
>  <name>mapred.reduce.tasks</name>
>  <value>2</value>
> </property>
>
> <property>
>  <name>mapred.tasktracker.map.tasks.maximum</name>
>  <value>4</value>
>  <description>
>        The maximum number of map tasks that will be run simultaneously by a
> task tracker.
>  </description>
> </property>
>
> <property>
>  <name>dfs.name.dir</name>
>
> <value>/data0/hbase/filesystem/dfs/name,/data1/hbase/filesystem/dfs/name,/data2/hbase/filesystem/dfs/name,/data3/hbase/filesystem/dfs/name</value>
> </property>
>
> <property>
>  <name>dfs.data.dir</name>
>
> <value>/data0/hbase/filesystem/dfs/data,/data1/hbase/filesystem/dfs/data,/data2/hbase/filesystem/dfs/data,/data3/hbase/filesystem/dfs/data</value>
> </property>
>
> <property>
>  <name>fs.checkpoint.dir</name>
>
> <value>/data0/hbase/filesystem/dfs/namesecondary,/data1/hbase/filesystem/dfs/namesecondary,/data2/hbase/filesystem/dfs/namesecondary,/data3/hbase/filesystem/dfs/namesecondary</value>
> </property>
>
> <property>
>  <name>mapred.system.dir</name>
>  <value>/data1/hbase/filesystem/mapred/system</value>
> </property>
>
> <property>
>  <name>mapred.local.dir</name>
>
> <value>/data0/hbase/filesystem/mapred/local,/data1/hbase/filesystem/mapred/local,/data2/hbase/filesystem/mapred/local,/data3/hbase/filesystem/mapred/local</value>
> </property>
>
> <property>
>  <name>dfs.replication</name>
>  <value>3</value>
> </property>
>
> <property>
>  <name>hadoop.tmp.dir</name>
>  <value>/data1/hbase/filesystem/tmp</value>
> </property>
>
> <property>
>  <name>mapred.task.timeout</name>
>  <value>3600000</value>
>  <description>The number of milliseconds before a task will be
>  terminated if it neither reads an input, writes an output, nor
>  updates its status string.
>  </description>
> </property>
>
> <property>
>  <name>ipc.client.idlethreshold</name>
>  <value>4000</value>
>  <description>Defines the threshold number of connections after which
>               connections will be inspected for idleness.
>  </description>
> </property>
>
>
> <property>
>  <name>ipc.client.connection.maxidletime</name>
>  <value>120000</value>
>  <description>The maximum time in msec after which a client will bring down
> the
>               connection to the server.
>  </description>
> </property>
>
> <property>
>  <!-- note: this property has no <name> element, so Hadoop ignores it;
>       the value looks like JVM options (e.g. mapred.child.java.opts) -->
>  <value>-Xmx256m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode</value>
> </property>
>
> </configuration>
>
>    And here is the hbase-site.xml config file:
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
>
> <configuration>
>  <property>
>    <name>hbase.master</name>
>    <value>192.168.33.204:62000</value>
>    <description>The host and port that the HBase master runs at.
>    A value of 'local' runs the master and a regionserver in
>    a single process.
>    </description>
>  </property>
>  <property>
>    <name>hbase.rootdir</name>
>    <value>hdfs://192.168.33.204:11004/hbase</value>
>    <description>The directory shared by region servers.
>    Should be fully-qualified to include the filesystem to use.
>    E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR
>    </description>
>  </property>
>
>  <property>
>    <name>hbase.master.info.port</name>
>    <value>62010</value>
>    <description>The port for the hbase master web UI
>    Set to -1 if you do not want the info server to run.
>    </description>
>  </property>
>  <property>
>    <name>hbase.master.info.bindAddress</name>
>    <value>0.0.0.0</value>
>    <description>The address for the hbase master web UI
>    </description>
>  </property>
>  <property>
>    <name>hbase.regionserver</name>
>    <value>0.0.0.0:62020</value>
>    <description>The host and port a HBase region server runs at.
>    </description>
>  </property>
>
>  <property>
>    <name>hbase.regionserver.info.port</name>
>    <value>62030</value>
>    <description>The port for the hbase regionserver web UI
>    Set to -1 if you do not want the info server to run.
>    </description>
>  </property>
>  <property>
>    <name>hbase.regionserver.info.bindAddress</name>
>    <value>0.0.0.0</value>
>    <description>The address for the hbase regionserver web UI
>    </description>
>  </property>
>
>  <property>
>    <name>hbase.regionserver.handler.count</name>
>    <value>20</value>
>  </property>
>
>  <property>
>    <name>hbase.master.lease.period</name>
>    <value>180000</value>
>  </property>
>
> </configuration>
>
>
>    Here is a slice of the error log from one of the failed
> regionservers, which stopped responding after the OutOfMemoryError:
>
> 2009-04-13 15:20:26,077 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
> aborting.
> java.lang.OutOfMemoryError: Java heap space
> 2009-04-13 15:20:48,062 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0, regions=121, stores=121, storefiles=5188, storefileIndexSize=195,
> memcacheSize=214, usedHeap=4991, maxHeap=4991
> 2009-04-13 15:20:48,062 INFO org.apache.hadoop.ipc.HBaseServer: Stopping
> server on 62020
> 2009-04-13 15:20:48,063 INFO
> org.apache.hadoop.hbase.regionserver.LogFlusher:
> regionserver/0:0:0:0:0:0:0:0:62020.logFlusher exiting
> 2009-04-13 15:20:48,201 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2009-04-13 15:20:48,228 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@74f0bb4e,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@689939dc) from
> 192.168.33.206:47754: output error
> 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 5 on 62020: exiting
> 2009-04-13 15:20:48,297 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
> Server Responder
> 2009-04-13 15:20:48,552 INFO org.apache.zookeeper.ClientCnxn: Attempting
> connection to server /192.168.33.204:2181
> 2009-04-13 15:20:48,552 WARN org.apache.zookeeper.ClientCnxn: Exception
> closing session 0x0 to sun.nio.ch.SelectionKeyImpl@480edf31
> java.io.IOException: TIMED OUT
>    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:837)
> 2009-04-13 15:20:48,555 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 62020, call batchUpdates([B@3509aa7f,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from 192.168.33.234:44367:
> error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@525a19ce,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@19544d9f) from
> 192.168.33.208:47852: output error
> 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@483206fe,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4c6932b9) from
> 192.168.33.221:37020: output error
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 0 on 62020: exiting
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,655 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 7 on 62020: exiting
> 2009-04-13 15:20:48,692 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@61af3c0e,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@378fed3c) from 192.168.34.1:35923:
> output error
> 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@2c4ff8dd,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@365b8be5) from 192.168.34.3:39443:
> output error
> 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 16 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 16 on 62020: exiting
> 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@343d8344,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@32750027) from
> 192.168.33.236:45479: output error
> 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 17 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 17 on 62020: exiting
> 2009-04-13 15:20:48,654 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@3ff34fed,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@7f047167) from
> 192.168.33.219:40059: output error
> 2009-04-13 15:20:48,654 ERROR com.cmri.hugetable.zookeeper.ZNodeWatcher:
> processNode /hugetable09/hugetable/acl.lock error!KeeperErrorCode =
> ConnectionLoss
> 2009-04-13 15:20:48,649 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@721d9b81,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@75cc6cae) from
> 192.168.33.254:51617: output error
> 2009-04-13 15:20:48,649 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 12 on 62020, call batchUpdates([B@655edc27,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from
> 192.168.33.238:51231: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@3c853cce,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4f5b176c) from
> 192.168.33.209:43520: output error
> 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,226 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 4 on 62020: exiting
> 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@3509aa7f,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from 192.168.33.234:44367:
> output error
> 2009-04-13 15:20:48,647 INFO org.mortbay.util.ThreadedServer: Stopping
> Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=62030]
> 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 9 on 62020: exiting
> 2009-04-13 15:20:48,646 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 2 on 62020, call batchUpdates([B@2cc91b6,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from
> 192.168.33.210:44154: error: java.io.IOException: Server not running,
> aborting
> java.io.IOException: Server not running, aborting
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,572 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@e8136e0,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4539b390) from
> 192.168.33.217:60476: output error
> 2009-04-13 15:20:49,272 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@2cc91b6,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from
> 192.168.33.210:44154: output error
> 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 8 on 62020: exiting
> 2009-04-13 15:20:49,263 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@655edc27,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from
> 192.168.33.238:51231: output error
> 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 1 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,068 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020 caught: java.nio.channels.ClosedByInterruptException
>    at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,345 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 14 on 62020: exiting
> 2009-04-13 15:20:49,048 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.OutOfMemoryError: Java heap space
> 2009-04-13 15:20:49,484 FATAL
> org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError,
> aborting.
> java.lang.OutOfMemoryError: Java heap space
>    at
> java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>    at
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>    at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:49,488 INFO
> org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics:
> request=0, regions=121, stores=121, storefiles=5188, storefileIndexSize=195,
> memcacheSize=214, usedHeap=4985, maxHeap=4991
> 2009-04-13 15:20:49,489 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 15 on 62020, call batchUpdates([B@302bb17f,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from 192.168.33.235:35276:
> error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1334)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1324)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2320)
>    at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>    at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>    at java.lang.reflect.Method.invoke(Method.java:597)
>    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>    at
> java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>    at
> org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>    at
> org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>    ... 5 more
> 2009-04-13 15:20:49,490 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server
> Responder, call batchUpdates([B@302bb17f,
> [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from 192.168.33.235:35276:
> output error
> 2009-04-13 15:20:49,047 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC
> Server listener on 62020
> 2009-04-13 15:20:49,493 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server
> handler 15 on 62020 caught: java.nio.channels.ClosedChannelException
>    at
> sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>    at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>    at
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
>    Any suggestion is welcome! Thanks a lot!
>
