From: Jean-Daniel Cryans <jdcryans@gmail.com>
To: hbase-user@hadoop.apache.org
Date: Mon, 13 Apr 2009 08:40:57 -0400
Subject: Re: Region Server lost response when doing BatchUpdate

I see that your region server had 5188 store files across 121 stores; I'm
99% sure that this is the cause of your OOME. Luckily for you, we've been
working on this issue since last week. What you should do:

- Upgrade to HBase 0.19.1
- Apply the latest patch (the v3) in
  https://issues.apache.org/jira/browse/HBASE-1058

Then you should be good.

As to what caused this huge number of store files: I wouldn't be surprised
if your data was uploaded sequentially. That would mean that whatever the
number of regions in your table (hence the level of distribution), only one
region gets the load at any given time. This implies that another
workaround for your problem would be to insert with a more randomized key
pattern, along the lines of the sketch below.
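For illustration, here is a minimal sketch of a salted upload against the
0.19 client API (HTable, BatchUpdate). The class name, table name, column
name, and bucket count are all made up for the example; only the
key-prefixing idea matters.

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.io.BatchUpdate;

  public class SaltedUpload {

    // Hypothetical bucket count; more buckets means writes are spread
    // over more regions at once.
    private static final int BUCKETS = 32;

    // Prefix a sequential key with a deterministic hash bucket so that
    // consecutive input rows land in different regions.
    static String salt(String row) {
      return String.format("%02d-%s", Math.abs(row.hashCode() % BUCKETS), row);
    }

    public static void main(String[] args) throws Exception {
      HTable table = new HTable(new HBaseConfiguration(), "mytable");
      for (int i = 0; i < 1000; i++) {
        String row = String.format("row-%08d", i);   // sequential input key
        BatchUpdate bu = new BatchUpdate(salt(row)); // salted on-disk key
        bu.put("data:value", ("value-" + i).getBytes());
        table.commit(bu);
      }
    }
  }

The trade-off is that rows are no longer stored in their natural order, so
a range scan has to fan out over every bucket prefix.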
Thanks for trying either solution,

J-D

On Mon, Apr 13, 2009 at 8:28 AM, 11 Nov. wrote:
> hi colleagues,
>    We have recently been doing data inserts on a 32-node HBase cluster
> using the MapReduce framework, but the operation always fails because of
> regionserver exceptions. We run 4 map tasks simultaneously on the same
> node and use BatchUpdate() to handle the inserting work.
>    We have been suffering from this problem since last month; it only
> shows up on relatively large clusters at a high concurrent insert rate.
> We are using hadoop-0.19.2 from current svn (the head revision as of
> last week) and HBase 0.19.0.
>
>    Here is the hadoop-site.xml configuration file:
>
> <configuration>
>
>   <property>
>     <name>fs.default.name</name>
>     <value>hdfs://192.168.33.204:11004/</value>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker</name>
>     <value>192.168.33.204:11005</value>
>   </property>
>
>   <property>
>     <name>dfs.secondary.http.address</name>
>     <value>0.0.0.0:51100</value>
>     <description>The secondary namenode http server address and port.
>     If the port is 0 then the server will start on a free port.</description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.address</name>
>     <value>0.0.0.0:51110</value>
>     <description>The address where the datanode server will listen to.
>     If the port is 0 then the server will start on a free port.</description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.http.address</name>
>     <value>0.0.0.0:51175</value>
>     <description>The datanode http server address and port.
>     If the port is 0 then the server will start on a free port.</description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.ipc.address</name>
>     <value>0.0.0.0:11010</value>
>     <description>The datanode ipc server address and port.
>     If the port is 0 then the server will start on a free port.</description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.handler.count</name>
>     <value>30</value>
>     <description>The number of server threads for the datanode.</description>
>   </property>
>
>   <property>
>     <name>dfs.namenode.handler.count</name>
>     <value>30</value>
>     <description>The number of server threads for the namenode.</description>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker.handler.count</name>
>     <value>30</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.parallel.copies</name>
>     <value>30</value>
>   </property>
>
>   <property>
>     <name>dfs.http.address</name>
>     <value>0.0.0.0:51170</value>
>     <description>The address and the base port where the dfs namenode
>     web ui will listen on. If the port is 0 then the server will start
>     on a free port.</description>
>   </property>
>
>   <property>
>     <name>dfs.datanode.max.xcievers</name>
>     <value>8192</value>
>   </property>
>
>   <property>
>     <name>dfs.datanode.socket.write.timeout</name>
>     <value>0</value>
>   </property>
>
>   <property>
>     <name>dfs.datanode.https.address</name>
>     <value>0.0.0.0:50477</value>
>   </property>
>
>   <property>
>     <name>dfs.https.address</name>
>     <value>0.0.0.0:50472</value>
>   </property>
>
>   <property>
>     <name>mapred.job.tracker.http.address</name>
>     <value>0.0.0.0:51130</value>
>     <description>The job tracker http server address and port the server
>     will listen on. If the port is 0 then the server will start on a
>     free port.</description>
>   </property>
>
>   <property>
>     <name>mapred.task.tracker.http.address</name>
>     <value>0.0.0.0:51160</value>
>     <description>The task tracker http server address and port.
>     If the port is 0 then the server will start on a free port.</description>
>   </property>
>
>   <property>
>     <name>mapred.map.tasks</name>
>     <value>3</value>
>   </property>
>
>   <property>
>     <name>mapred.reduce.tasks</name>
>     <value>2</value>
>   </property>
>
>   <property>
>     <name>mapred.tasktracker.map.tasks.maximum</name>
>     <value>4</value>
>     <description>The maximum number of map tasks that will be run
>     simultaneously by a task tracker.</description>
>   </property>
>
>   <property>
>     <name>dfs.name.dir</name>
>     <value>/data0/hbase/filesystem/dfs/name,/data1/hbase/filesystem/dfs/name,/data2/hbase/filesystem/dfs/name,/data3/hbase/filesystem/dfs/name</value>
>   </property>
>
>   <property>
>     <name>dfs.data.dir</name>
>     <value>/data0/hbase/filesystem/dfs/data,/data1/hbase/filesystem/dfs/data,/data2/hbase/filesystem/dfs/data,/data3/hbase/filesystem/dfs/data</value>
>   </property>
>
>   <property>
>     <name>fs.checkpoint.dir</name>
>     <value>/data0/hbase/filesystem/dfs/namesecondary,/data1/hbase/filesystem/dfs/namesecondary,/data2/hbase/filesystem/dfs/namesecondary,/data3/hbase/filesystem/dfs/namesecondary</value>
>   </property>
>
>   <property>
>     <name>mapred.system.dir</name>
>     <value>/data1/hbase/filesystem/mapred/system</value>
>   </property>
>
>   <property>
>     <name>mapred.local.dir</name>
>     <value>/data0/hbase/filesystem/mapred/local,/data1/hbase/filesystem/mapred/local,/data2/hbase/filesystem/mapred/local,/data3/hbase/filesystem/mapred/local</value>
>   </property>
>
>   <property>
>     <name>dfs.replication</name>
>     <value>3</value>
>   </property>
>
>   <property>
>     <name>hadoop.tmp.dir</name>
>     <value>/data1/hbase/filesystem/tmp</value>
>   </property>
>
>   <property>
>     <name>mapred.task.timeout</name>
>     <value>3600000</value>
>     <description>The number of milliseconds before a task will be
>     terminated if it neither reads an input, writes an output, nor
>     updates its status string.</description>
>   </property>
>
>   <property>
>     <name>ipc.client.idlethreshold</name>
>     <value>4000</value>
>     <description>Defines the threshold number of connections after which
>     connections will be inspected for idleness.</description>
>   </property>
>
>   <property>
>     <name>ipc.client.connection.maxidletime</name>
>     <value>120000</value>
>     <description>The maximum time in msec after which a client will
>     bring down the connection to the server.</description>
>   </property>
>
>   <property>
>     <name><!-- property name lost in the archive --></name>
>     <value>-Xmx256m -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode</value>
>   </property>
>
> </configuration>
>
>    And here is the hbase-site.xml config file:
>
> <configuration>
>
>   <property>
>     <name>hbase.master</name>
>     <value>192.168.33.204:62000</value>
>     <description>The host and port that the HBase master runs at.
>     A value of 'local' runs the master and a regionserver in
>     a single process.</description>
>   </property>
>
>   <property>
>     <name>hbase.rootdir</name>
>     <value>hdfs://192.168.33.204:11004/hbase</value>
>     <description>The directory shared by region servers.
>     Should be fully-qualified to include the filesystem to use.
>     E.g: hdfs://NAMENODE_SERVER:PORT/HBASE_ROOTDIR</description>
>   </property>
>
>   <property>
>     <name>hbase.master.info.port</name>
>     <value>62010</value>
>     <description>The port for the hbase master web UI.
>     Set to -1 if you do not want the info server to run.</description>
>   </property>
>
>   <property>
>     <name>hbase.master.info.bindAddress</name>
>     <value>0.0.0.0</value>
>     <description>The address for the hbase master web UI</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver</name>
>     <value>0.0.0.0:62020</value>
>     <description>The host and port a HBase region server runs at.</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver.info.port</name>
>     <value>62030</value>
>     <description>The port for the hbase regionserver web UI.
>     Set to -1 if you do not want the info server to run.</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver.info.bindAddress</name>
>     <value>0.0.0.0</value>
>     <description>The address for the hbase regionserver web UI</description>
>   </property>
>
>   <property>
>     <name>hbase.regionserver.handler.count</name>
>     <value>20</value>
>   </property>
>
>   <property>
>     <name>hbase.master.lease.period</name>
>     <value>180000</value>
>   </property>
>
> </configuration>
>
>    Here is a slice of the error log from one of the failed
> regionservers, which stopped responding after the OOME:
>
> 2009-04-13 15:20:26,077 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
> java.lang.OutOfMemoryError: Java heap space
> 2009-04-13 15:20:48,062 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0, regions=121, stores=121, storefiles=5188, storefileIndexSize=195, memcacheSize=214, usedHeap=4991, maxHeap=4991
> 2009-04-13 15:20:48,062 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 62020
> 2009-04-13 15:20:48,063 INFO org.apache.hadoop.hbase.regionserver.LogFlusher: regionserver/0:0:0:0:0:0:0:0:62020.logFlusher exiting
> 2009-04-13 15:20:48,201 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Stopping infoServer
> 2009-04-13 15:20:48,228 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@74f0bb4e, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@689939dc) from 192.168.33.206:47754: output error
> 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,229 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 62020: exiting
> 2009-04-13 15:20:48,297 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> 2009-04-13 15:20:48,552 INFO org.apache.zookeeper.ClientCnxn: Attempting connection to server /192.168.33.204:2181
> 2009-04-13 15:20:48,552 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x0 to sun.nio.ch.SelectionKeyImpl@480edf31
> java.io.IOException: TIMED OUT
>     at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:837)
> 2009-04-13 15:20:48,555 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 62020, call batchUpdates([B@3509aa7f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from 192.168.33.234:44367: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@525a19ce, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@19544d9f) from 192.168.33.208:47852: output error
> 2009-04-13 15:20:48,561 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@483206fe, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4c6932b9) from 192.168.33.221:37020: output error
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 0 on 62020: exiting
> 2009-04-13 15:20:48,561 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,655 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 7 on 62020: exiting
> 2009-04-13 15:20:48,692 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@61af3c0e, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@378fed3c) from 192.168.34.1:35923: output error
> 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@2c4ff8dd, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@365b8be5) from 192.168.34.3:39443: output error
> 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 16 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:48,877 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 16 on 62020: exiting
> 2009-04-13 15:20:48,877 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@343d8344, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@32750027) from 192.168.33.236:45479: output error
> 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 17 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,008 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 17 on 62020: exiting
> 2009-04-13 15:20:48,654 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@3ff34fed, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@7f047167) from 192.168.33.219:40059: output error
> 2009-04-13 15:20:48,654 ERROR com.cmri.hugetable.zookeeper.ZNodeWatcher: processNode /hugetable09/hugetable/acl.lock error!KeeperErrorCode = ConnectionLoss
> 2009-04-13 15:20:48,649 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@721d9b81, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@75cc6cae) from 192.168.33.254:51617: output error
> 2009-04-13 15:20:48,649 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 12 on 62020, call batchUpdates([B@655edc27, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from 192.168.33.238:51231: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@3c853cce, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4f5b176c) from 192.168.33.209:43520: output error
> 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,226 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 4 on 62020: exiting
> 2009-04-13 15:20:48,648 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@3509aa7f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@d98930d) from 192.168.33.234:44367: output error
> 2009-04-13 15:20:48,647 INFO org.mortbay.util.ThreadedServer: Stopping Acceptor ServerSocket[addr=0.0.0.0/0.0.0.0,port=0,localport=62030]
> 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,266 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 9 on 62020: exiting
> 2009-04-13 15:20:48,646 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 62020, call batchUpdates([B@2cc91b6, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from 192.168.33.210:44154: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2809)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2304)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:48,572 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@e8136e0, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@4539b390) from 192.168.33.217:60476: output error
> 2009-04-13 15:20:49,272 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@2cc91b6, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@44724529) from 192.168.33.210:44154: output error
> 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,272 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 62020: exiting
> 2009-04-13 15:20:49,263 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@655edc27, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@36c7b86f) from 192.168.33.238:51231: output error
> 2009-04-13 15:20:49,225 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 1 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,068 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 14 on 62020 caught: java.nio.channels.ClosedByInterruptException
>     at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:184)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:341)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
> 2009-04-13 15:20:49,345 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 14 on 62020: exiting
> 2009-04-13 15:20:49,048 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer:
> java.lang.OutOfMemoryError: Java heap space
> 2009-04-13 15:20:49,484 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: OutOfMemoryError, aborting.
> java.lang.OutOfMemoryError: Java heap space
>     at java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>     at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> 2009-04-13 15:20:49,488 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=0, regions=121, stores=121, storefiles=5188, storefileIndexSize=195, memcacheSize=214, usedHeap=4985, maxHeap=4991
> 2009-04-13 15:20:49,489 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 15 on 62020, call batchUpdates([B@302bb17f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from 192.168.33.235:35276: error: java.io.IOException: java.lang.OutOfMemoryError: Java heap space
> java.io.IOException: java.lang.OutOfMemoryError: Java heap space
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1334)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1324)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2320)
>     at sun.reflect.GeneratedMethodAccessor20.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:632)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:895)
> Caused by: java.lang.OutOfMemoryError: Java heap space
>     at java.util.concurrent.ConcurrentHashMap$Values.iterator(ConcurrentHashMap.java:1187)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.getGlobalMemcacheSize(HRegionServer.java:2863)
>     at org.apache.hadoop.hbase.regionserver.MemcacheFlusher.reclaimMemcacheMemory(MemcacheFlusher.java:260)
>     at org.apache.hadoop.hbase.regionserver.HRegionServer.batchUpdates(HRegionServer.java:2307)
>     ... 5 more
> 2009-04-13 15:20:49,490 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call batchUpdates([B@302bb17f, [Lorg.apache.hadoop.hbase.io.BatchUpdate;@492218e) from 192.168.33.235:35276: output error
> 2009-04-13 15:20:49,047 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server listener on 62020
> 2009-04-13 15:20:49,493 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 15 on 62020 caught: java.nio.channels.ClosedChannelException
>     at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.channelWrite(HBaseServer.java:1085)
>     at org.apache.hadoop.hbase.ipc.HBaseServer.access$1900(HBaseServer.java:70)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.processResponse(HBaseServer.java:593)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Responder.doRespond(HBaseServer.java:657)
>     at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:923)
>
>    Any suggestion is welcome! Thanks a lot!
>