Date: Sun, 13 Dec 2009 00:46:02 -0800
Subject: Re: [jira] Resolved: (HBASE-1972) Failed split results in closed region and non-registration of daughters; fix the order in which things are run
From: Ryan Rawson
To: hbase-dev@hadoop.apache.org

I have an idea on how to make splits suck less....

What we do is this on split:
- do the split in the RS, reopen the regions w/o reassigning them
- report the split to META afterwards

Also fix the way the regionserver works, and don't depend on the region name to do a put/get. If the put is for a region this RS controls, then just do it, even if the client has the 'wrong' RS.

While you are at it, send an OOB or some other signal that the client needs to reload those regions (or not even; there is no 'failure' here). This would require changing a bunch of things, but we shouldn't need to go down just because of a split.
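To make the "wrong region name" part concrete, here is a rough sketch of the dispatch I have in mind. To be clear, this is not the current HRegionServer code -- the class, the onlineRegions map and regionForRow() are made-up names, and a real version would keep a sorted map of start keys rather than scanning:

{code}
import java.io.IOException;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.hadoop.hbase.NotServingRegionException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.regionserver.HRegion;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch only: field and method names are illustrative, not HRegionServer's.
public class RowRoutedPutSketch {

  // Regions this server currently has open, keyed by region name.
  private final ConcurrentHashMap<String, HRegion> onlineRegions =
      new ConcurrentHashMap<String, HRegion>();

  public void put(byte[] regionName, Put put) throws IOException {
    HRegion region = onlineRegions.get(Bytes.toString(regionName));
    if (region == null) {
      // The client named a region we no longer serve -- e.g. the parent of a
      // split we just did locally. Route by row key to whichever daughter we
      // host instead of failing, and let the client refresh its cache lazily.
      region = regionForRow(put.getRow());
    }
    if (region == null) {
      throw new NotServingRegionException("Neither " + Bytes.toString(regionName) +
          " nor the row it carries is served here");
    }
    region.put(put);
  }

  // Naive linear scan over hosted regions; illustrative only.
  private HRegion regionForRow(byte[] row) {
    for (HRegion r : onlineRegions.values()) {
      if (r.getRegionInfo().containsRow(row)) {
        return r;
      }
    }
    return null;
  }
}
{code}

The point is just that the region name in the RPC becomes a hint rather than a hard requirement, so a freshly split parent doesn't have to turn into a round of NotServingRegionExceptions.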
On Sun, Dec 13, 2009 at 12:16 AM, Lars George wrote:
> Just as a note, I think I had the same issue. This is on my 7+1 cluster
> during a MR import job:
>
> 2009-12-08 01:15:45,772 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested on ma-docs,cb48e6aa06cd2937e095bfefbec7c357,1260256286643
> 2009-12-08 01:15:45,772 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memstore flush for region ma-docs,cb48e6aa06cd2937e095bfefbec7c357,1260256286643. Current region memstore size 64.2m
> 2009-12-08 01:15:57,409 WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.net.SocketTimeoutException: 10000 millis timeout while waiting for channel to be ready for write. ch : java.nio.channels.SocketChannel[connected local=/192.168.99.38:51729 remote=/192.168.99.38:50010]
>        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>        at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2290)
>
> 2009-12-08 01:15:57,409 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:57,410 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,567 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 1 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,569 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,569 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,583 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 2 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,585 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,585 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,591 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 3 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,593 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,593 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,598 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 4 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,600 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,600 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,608 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 5 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Will retry...
> 2009-12-08 01:15:58,610 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 bad datanode[0] 192.168.99.38:50010
> 2009-12-08 01:15:58,610 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 in pipeline 192.168.99.38:50010, 192.168.99.37:50010: bad datanode 192.168.99.38:50010
> 2009-12-08 01:15:58,615 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.37:50010 failed 6 times. Pipeline was 192.168.99.38:50010, 192.168.99.37:50010. Marking primary datanode as bad.
> 2009-12-08 01:15:58,625 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 1 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,637 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 2 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,654 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 3 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,668 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 4 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,678 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 5 times. Pipeline was 192.168.99.38:50010. Will retry...
> 2009-12-08 01:15:58,685 WARN org.apache.hadoop.hdfs.DFSClient: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 6 times. Pipeline was 192.168.99.38:50010. Aborting...
> 2009-12-08 01:15:58,685 FATAL org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required. Forcing server shutdown
> org.apache.hadoop.hbase.DroppedSnapshotException: region: ma-docs,cb48e6aa06cd2937e095bfefbec7c357,1260256286643
>        at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:946)
>        at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:839)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:241)
>        at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.run(MemStoreFlusher.java:149)
> Caused by: java.io.IOException: Error Recovery for block blk_2400329754585253075_931440 failed because recovery from primary datanode 192.168.99.38:50010 failed 6 times. Pipeline was 192.168.99.38:50010. Aborting...
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2584)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2078)
>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2241)
> 2009-12-08 01:15:58,688 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Dump of metrics: request=11.9, regions=37, stores=296, storefiles=697, storefileIndexSize=66, memstoreSize=1184, usedHeap=3319, maxHeap=4087, blockCacheSize=7033392, blockCacheFree=850216848, blockCacheCount=0, blockCacheHitRatio=0
> 2009-12-08 01:15:58,688 INFO org.apache.hadoop.hbase.regionserver.MemStoreFlusher: regionserver/192.168.99.38:60020.cacheFlusher exiting
> 2009-12-08 01:15:58,779 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 5 on 60020, call put([B@7237791f, [Lorg.apache.hadoop.hbase.client.Put;@17f11cce) from 192.168.99.34:34211: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2351)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1828)
>        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-12-08 01:15:58,796 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 13 on 60020, call put([B@569a24a9, [Lorg.apache.hadoop.hbase.client.Put;@21dcffaa) from 192.168.99.36:42492: error: java.io.IOException: Server not running, aborting
> java.io.IOException: Server not running, aborting
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.checkOpen(HRegionServer.java:2351)
>        at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1828)
>        at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>        at java.lang.reflect.Method.invoke(Method.java:597)
>        at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:648)
>        at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:915)
> 2009-12-08 01:16:00,151 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60020
> 2009-12-08 01:16:00,151 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 2 on 60020: exiting
> 2009-12-08 01:16:00,152 INFO org.apache.hadoop.ipc.HBaseServer: Stopping IPC Server Responder
> 2009-12-08 01:16:00,152 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 8 on 60020: exiting
> ...
>
> On Sun, Dec 13, 2009 at 1:03 AM, stack wrote:
>
>> I wrote hdfs-dev to see how to proceed. We could try running a vote to get
>> it committed to 0.21.
>> St.Ack
>>
>>
>> On Sat, Dec 12, 2009 at 1:37 PM, Andrew Purtell wrote:
>>
>> > I do. I think I saw it just last week with a failure case as follows on a
>> > small testbed (aren't they all? :-/ ) that some of our devs are working
>> > with:
>> >
>> > - Local RS and datanode are talking
>> >
>> > - Something happens to the datanode
>> >     org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketTimeoutException: 69000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel
>> >     org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>> >
>> > - RS won't try talking to other datanodes elsewhere on the cluster
>> >     org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7040605219500907455_6449696
>> >     org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5367929502764356875_6449620
>> >     org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_7075535856966512941_6449680
>> >     org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_77095304474221514_6449685
>> >
>> > - RS goes down
>> >     org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Replay of hlog required. Forcing server shutdown
>> >     org.apache.hadoop.hbase.DroppedSnapshotException ...
>> >
>> > Not a blocker in that the downed RS with working sync in 0.21 won't lose
>> > data and can be restarted. But, a critical issue because it will be
>> > frequently encountered and will cause processes on the cluster to shut down.
>> > Without some kind of "god" monitor or human intervention eventually there
>> > will be insufficient resources to carry all regions.
>> >
>> >   - Andy
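(A note on the "RS won't try talking to other datanodes" bullet above: that is the gap HDFS-630 is meant to close -- when createBlockOutputStream fails, the client should hand the namenode a list of datanodes to avoid on the next block allocation, rather than banging on the same dead node until the write aborts. A stripped-down sketch of that retry loop is below; BlockAllocator and its methods are placeholders, not the real DFSClient/NameNode API.)

{code}
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

public class ExcludeBadDatanodesSketch {

  // Placeholder for the namenode/pipeline interactions; not a real Hadoop interface.
  interface BlockAllocator {
    String allocateBlock(Set<String> excludedDatanodes) throws IOException;
    void openPipeline(String block) throws IOException;
  }

  static String createBlockAvoidingBadNodes(BlockAllocator allocator, int maxRetries)
      throws IOException {
    Set<String> excluded = new HashSet<String>();
    for (int attempt = 0; attempt < maxRetries; attempt++) {
      // Ask for a block whose pipeline avoids the datanodes that already failed us.
      String block = allocator.allocateBlock(excluded);
      try {
        allocator.openPipeline(block);
        return block; // pipeline established
      } catch (IOException e) {
        // Abandon the block and remember the node that failed, so the next
        // allocation steers around it. This is the part 0.20 does not do.
        excluded.add(firstBadDatanode(e));
      }
    }
    throw new IOException("Unable to create new block after " + maxRetries + " attempts");
  }

  // Placeholder: a real client would pull the failing datanode out of the pipeline ack.
  static String firstBadDatanode(IOException e) {
    return String.valueOf(e.getMessage());
  }
}
{code}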
>> >
>> >
>> > ________________________________
>> > From: Stack
>> > To: "hbase-dev@hadoop.apache.org"
>> > Sent: Sat, December 12, 2009 1:01:49 PM
>> > Subject: Re: [jira] Resolved: (HBASE-1972) Failed split results in closed region and non-registration of daughters; fix the order in which things are run
>> >
>> > So we think this is critical to hbase?
>> > Stack
>> >
>> >
>> > On Dec 12, 2009, at 12:43 PM, Andrew Purtell wrote:
>> >
>> > > All HBase committers should jump on that issue and +1. We should make
>> > > that kind of statement for the record.
>> > >
>> > >
>> > > ________________________________
>> > > From: stack (JIRA)
>> > > To: hbase-dev@hadoop.apache.org
>> > > Sent: Sat, December 12, 2009 12:39:18 PM
>> > > Subject: [jira] Resolved: (HBASE-1972) Failed split results in closed region and non-registration of daughters; fix the order in which things are run
>> > >
>> > >
>> > >     [ https://issues.apache.org/jira/browse/HBASE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>> > >
>> > > stack resolved HBASE-1972.
>> > > --------------------------
>> > >
>> > >     Resolution: Won't Fix
>> > >
>> > > Marking as invalid; addressed by hdfs-630. Thanks for looking at this, Cosmin. Want to open an issue on getting 630 into 0.21? There will be pushback I'd imagine since it's not "critical", but it might make 0.21.1.
>> > >
>> > >> Failed split results in closed region and non-registration of daughters;
>> > >> fix the order in which things are run
>> > >> --------------------------------------------------------------------------------------------------------------
>> > >>
>> > >>                 Key: HBASE-1972
>> > >>                 URL: https://issues.apache.org/jira/browse/HBASE-1972
>> > >>             Project: Hadoop HBase
>> > >>          Issue Type: Bug
>> > >>            Reporter: stack
>> > >>            Priority: Blocker
>> > >>             Fix For: 0.21.0
>> > >>
>> > >>
>> > >> As part of a split, we go to close the region. The close fails because the
>> > >> flush failed -- a DN was down and HDFS refuses to move past it -- so we jump
>> > >> up out of the close with an IOE. But the region has been closed, yet it's
>> > >> still in the .META. as online.
>> > >> Here is where the hole is:
>> > >> 1. CompactSplitThread calls split.
>> > >> 2. This calls HRegion splitRegion.
>> > >> 3. splitRegion calls close(false).
>> > >> 4. Down at the end of the close, we get as far as the LOG.info("Closed " + this).....
>> > >> but a DFSClient running thread throws an exception because it can't allocate a
>> > >> block for the flush made as part of the close (ain't sure how... we should add
>> > >> more try/catch in here):
>> > >> {code}
>> > >> 2009-11-12 00:47:17,865 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://aa0-000-12.u.powerset.com:9002/hbase/TestTable/868626151/info/5071349140567656566, entries=46975, sequenceid=2350017, memsize=52.0m, filesize=46.5m to TestTable,,1257986664542
>> > >> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~52.0m for region TestTable,,1257986664542 in 7985ms, sequence id=2350017, compaction requested=false
>> > >> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: closed info
>> > >> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,,1257986664542
>> > >> 2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_1351692500502810095_1391
>> > >> 2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3310646336307339512_1391
>> > >> 2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_3070440586900692765_1393
>> > >> 2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5656011219762164043_1393
>> > >> 2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2359634393837722978_1393
>> > >> 2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >> 2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1626727145091780831_1393
>> > >> 2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>> > >>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3100)
>> > >>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
>> > >> 2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/TestTable/868626151/splits/1211221550/info/5071349140567656566.868626151" - Aborting...
>> > >> 2009-11-12 00:47:54,029 [regionserver/208.76.44.142:60020.compactor] ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region TestTable,,1257986664542
>> > >> java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> > >>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
>> > >>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
>> > >>        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
>> > >> {code}
>> > >> Marking this as blocker.
>> > >
>> > > --
>> > > This message is automatically generated by JIRA.
>> > > You can reply to this email to add a comment to the issue online.
>> >
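Coming back to the "we should add more try/catch in here" remark in the issue above: the shape of the fix is roughly to not touch .META. until the daughters actually exist, and, if the close-side flush blows up, to reopen the parent instead of leaving it closed but still registered as online. The sketch below is illustrative only -- closeForSplit(), reopen(), createDaughters() and offlineParentAndAddDaughters() are invented names, not the actual CompactSplitThread/HRegion methods:

{code}
import java.io.IOException;

public class SplitOrderingSketch {

  // Minimal stand-ins for the region and catalog operations involved in a split.
  interface Region {
    void closeForSplit() throws IOException;   // close + final flush; can fail on HDFS errors
    void reopen() throws IOException;          // bring the parent back online
    Region[] createDaughters() throws IOException;
  }

  interface Meta {
    void offlineParentAndAddDaughters(Region parent, Region[] daughters) throws IOException;
  }

  void split(Region parent, Meta meta) throws IOException {
    Region[] daughters;
    try {
      parent.closeForSplit();
      daughters = parent.createDaughters();
    } catch (IOException e) {
      // The try/catch the issue asks for: if anything fails before the
      // daughters are in place, undo the close so the parent is never left
      // closed on the RS yet still listed as online in .META.
      parent.reopen();
      throw e;
    }
    // Only now does .META. change: offline the parent, register the daughters.
    meta.offlineParentAndAddDaughters(parent, daughters);
  }
}
{code}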