hbase-dev mailing list archives

From Stack <saint....@gmail.com>
Subject Re: [jira] Resolved: (HBASE-1972) Failed split results in closed region and non-registration of daughters; fix the order in which things are run
Date Sat, 12 Dec 2009 21:01:49 GMT
So we think this is critical to HBase?
Stack



On Dec 12, 2009, at 12:43 PM, Andrew Purtell <apurtell@apache.org>  
wrote:

> All HBase committers should jump on that issue and +1. We should  
> make that kind of statement for the record.
>
>
>
>
> ________________________________
> From: stack (JIRA) <jira@apache.org>
> To: hbase-dev@hadoop.apache.org
> Sent: Sat, December 12, 2009 12:39:18 PM
> Subject: [jira] Resolved: (HBASE-1972) Failed split results in  
> closed region and non-registration of daughters; fix the order in  
> which things are run
>
>
>     [ https://issues.apache.org/jira/browse/HBASE-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

>  ]
>
> stack resolved HBASE-1972.
> --------------------------
>
>    Resolution: Won't Fix
>
> Marking as invalid; addressed by HDFS-630. Thanks for looking at this,
> Cosmin. Want to open an issue on getting 630 into 0.21? There will be
> pushback, I'd imagine, since it's not "critical", but it might make
> 0.21.1.
>
>> Failed split results in closed region and non-registration of  
>> daughters; fix the order in which things are run
>> --------------------------------------------------------------------
>>
>>                Key: HBASE-1972
>>                URL: https://issues.apache.org/jira/browse/HBASE-1972
>>            Project: Hadoop HBase
>>         Issue Type: Bug
>>           Reporter: stack
>>           Priority: Blocker
>>            Fix For: 0.21.0
>>
>>
>> As part of a split, we go to close the region.  The close fails
>> because the flush failed -- a DN was down and HDFS refuses to move past
>> it -- so we jump up out of the close with an IOE.  But the region
>> has been closed, yet it's still listed in .META. as online.
>> Here is where the hole is:
>> 1. CompactSplitThread calls split.
>> 2. This calls HRegion splitRegion.
>> 3. splitRegion calls close(false).
>> 4. Down the end of the close, we get as far as the LOG.info("Closed  
>> " + this)..... but a DFSClient running thread throws an exception  
>> because it can't allocate block for the flush made as part of the  
>> close (Ain't sure how... we should add more try/catch in here):
>> {code}
>> 2009-11-12 00:47:17,865 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://aa0-000-12.u.powerset.com:9002/hbase/TestTable/868626151/info/5071349140567656566, entries=46975, sequenceid=2350017, memsize=52.0m, filesize=46.5m to TestTable,,1257986664542
>> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore flush of ~52.0m for region TestTable,,1257986664542 in 7985ms, sequence id=2350017, compaction requested=false
>> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] DEBUG org.apache.hadoop.hbase.regionserver.Store: closed info
>> 2009-11-12 00:47:17,866 [regionserver/208.76.44.142:60020.compactor] INFO org.apache.hadoop.hbase.regionserver.HRegion: Closed TestTable,,1257986664542
>> 2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> 2009-11-12 00:47:17,906 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_1351692500502810095_1391
>> 2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> 2009-11-12 00:47:23,918 [Thread-315] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-3310646336307339512_1391
>> 2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> 2009-11-12 00:47:29,982 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_3070440586900692765_1393
>> 2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> 2009-11-12 00:47:35,997 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5656011219762164043_1393
>> 2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> 2009-11-12 00:47:42,007 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-2359634393837722978_1393
>> 2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>> 2009-11-12 00:47:48,017 [Thread-318] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-1626727145091780831_1393
>> 2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3100)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
>> 2009-11-12 00:47:54,022 [Thread-318] WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/TestTable/868626151/splits/1211221550/info/5071349140567656566.868626151" - Aborting...
>> 2009-11-12 00:47:54,029 [regionserver/208.76.44.142:60020.compactor] ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction/Split failed for region TestTable,,1257986664542
>> java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
>>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
>> {code}
>> Marking this as blocker.
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
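[Editor's note] The four-step sequence in the quoted report boils down to a close with no rollback path: once close() throws after taking the region offline, nothing puts it back, so .META. and reality diverge. Below is a minimal, hypothetical sketch of the reordering the issue title asks for. Region, close(), reopen(), and split() here are stand-ins invented for illustration, not HBase's actual classes or method signatures.

```java
import java.io.IOException;

public class SplitOrderSketch {
    // Hypothetical stand-in for a region; not HBase's HRegion.
    static class Region {
        boolean online = true;
        final boolean flushWillFail;

        Region(boolean flushWillFail) { this.flushWillFail = flushWillFail; }

        // Takes the region offline, then flushes; the flush is where the
        // DFSClient "Bad connect ack" surfaced in the quoted log.
        void close() throws IOException {
            online = false;
            if (flushWillFail) {
                throw new IOException("Bad connect ack with firstBadLink");
            }
        }

        // Rollback step the original code path lacked.
        void reopen() { online = true; }
    }

    // Returns true if the split ran, false if it was aborted and rolled back.
    static boolean split(Region r) {
        try {
            r.close();                // close before creating daughters
        } catch (IOException e) {
            r.reopen();               // the fix: undo the close on failure,
            return false;             // so the region stays in service
        }
        // ... create daughter regions and register them in the catalog ...
        return true;
    }

    public static void main(String[] args) {
        Region ok = new Region(false);
        System.out.println("clean split ran: " + split(ok));

        Region bad = new Region(true);
        boolean ran = split(bad);
        System.out.println("failed split ran: " + ran
                + ", region still online: " + bad.online);
    }
}
```

The point of the ordering is that the only step taken before the failable I/O is one that can be undone locally; daughter registration happens last, so a failure never leaves a closed region advertised as online.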
