hbase-issues mailing list archives

From "HBase Review Board (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-2461) Split doesn't handle IOExceptions when creating new region reference files
Date Tue, 03 Aug 2010 01:37:17 GMT

    [ https://issues.apache.org/jira/browse/HBASE-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12894775#action_12894775 ]

HBase Review Board commented on HBASE-2461:
-------------------------------------------

Message from: stack@duboce.net

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://review.cloudera.org/r/474/
-----------------------------------------------------------

Review request for hbase.


Summary
-------

Posting this to review board.

This patch keeps a journal during the split transaction.  If the split fails, a call to rollback
restores the parent to its original open condition by backing out whatever transaction steps had completed.

The transaction spans split checks, closing of parent region and creation of daughters up
to the addition of parent offlining to .META.  Once the .META. edit has been made, we cannot
roll back -- we have to go forward.  This means that the basescanner fixup, which adds missing
daughter regions should the regionserver crash after the parent region edit but before it has
added the daughters, is still required, in some form at least.
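
For readers who want the journal idea in code, here is a minimal sketch, not the code in the
patch.  The class name, the step names and the failAt fault hook are all made up for
illustration.  It shows steps being journaled as they complete, rollback undoing them in
reverse order, and the .META. edit as the point after which rollback is refused.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; step names and methods are illustrative, not the patch's API.
class JournaledSplitSketch {
  // Reversible steps, in the order the description above lists them.
  enum Step { CREATE_SPLIT_DIR, CLOSE_PARENT, CREATE_DAUGHTERS }

  private final Deque<Step> journal = new ArrayDeque<>(); // most recent step on top
  private final Step failAt;                               // injected fault; null for none
  private boolean metaEdited = false;                      // true once past the point of no return
  final List<String> log = new ArrayList<>();              // records what was done and undone

  JournaledSplitSketch(Step failAt) {
    this.failAt = failAt;
  }

  // Runs the reversible steps, journaling each one only after it completes.
  void execute() throws IOException {
    for (Step step : Step.values()) {
      if (step == failAt) {
        throw new IOException("injected failure at " + step);
      }
      log.add("do " + step);
      journal.push(step);
    }
    // Stand-in for the .META. edit; after this we can only go forward.
    metaEdited = true;
    log.add("edit .META.");
  }

  // Undoes completed steps in reverse order; refuses once .META. has been edited.
  void rollback() {
    if (metaEdited) {
      throw new IllegalStateException("past the .META. edit; cannot roll back");
    }
    while (!journal.isEmpty()) {
      log.add("undo " + journal.pop());
    }
  }
}
```

With failAt set to CREATE_DAUGHTERS, execute throws after two steps have been journaled and
rollback undoes CLOSE_PARENT first, then CREATE_SPLIT_DIR.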

This patch includes a test of the new split code but only run against an HRegion, not in server
context.  The split code is buried in heart of the regionserver and created on startup.  I
stared at it for a while and injecting fault was just forbidding.  Its like bramble; there
are so many spikes in the way of getting your finger down into the running split I ended up
passing on it.

+ M src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java
(split): Most of the split code has been moved out to the new SplitTransaction class.
Now this method prepares the split transaction, executes it, and on failure rolls back.
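
The prepare / execute / rollback flow described for this method could look roughly like the
following.  This is a sketch under assumptions: SplitSketch and SplitCallerSketch are
hypothetical names, and the method signatures are guesses, not those of the real
SplitTransaction.

```java
import java.io.IOException;

// Hypothetical stand-in for the transaction; signatures are assumptions.
interface SplitSketch {
  boolean prepare();                  // false if the region should not be split
  void execute() throws IOException;  // runs the split steps
  void rollback();                    // undoes whatever steps completed
}

class SplitCallerSketch {
  // Returns "skipped", "split", or "rolled back" so the outcome is easy to check.
  static String split(SplitSketch st) {
    if (!st.prepare()) {
      return "skipped";
    }
    try {
      st.execute();
      return "split";
    } catch (IOException e) {
      // Restore the parent rather than leaving it closed and unreachable.
      st.rollback();
      return "rolled back";
    }
  }
}
```

The sketch catches IOException only, following the issue title.  The trace in the original
report is an NPE, so whether runtime exceptions should also trigger rollback is a choice this
sketch leaves open.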

+ M src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
(splitLock) Removed.  Doesn't seem necessary.  Just made close method synchronized.
(SPLITDIR) Moved to new SplitTransaction
Moved cleanup of half-done splits into SplitTransaction.  It'll know better how to do this.
Moved split code into SplitTransaction class.

+ M src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
Made this class implement new OnlineRegions interface

+ A src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java
New Interface that allows you to add/remove regions from the set of online regions.  This
Interface adds little.  I was just trying to make it so the tests didn't need a server
context, but in the end I just passed null for the case of no server context.  Could remove
this.
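
To illustrate the "pass null when there is no server context" approach, here is a small
sketch.  The interface shape, the method names and the use of a String for a region are
assumptions for illustration, not the contents of OnlineRegions.java.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical shape; method names and the String region type are assumptions.
interface OnlineRegionsSketch {
  void addToOnlineRegions(String regionName);
  boolean removeFromOnlineRegions(String regionName);
}

// Simple in-memory implementation, standing in for the regionserver.
class InMemoryOnlineRegions implements OnlineRegionsSketch {
  private final Map<String, Boolean> online = new ConcurrentHashMap<>();

  public void addToOnlineRegions(String regionName) {
    online.put(regionName, Boolean.TRUE);
  }

  public boolean removeFromOnlineRegions(String regionName) {
    return online.remove(regionName) != null;
  }

  boolean isOnline(String regionName) {
    return online.containsKey(regionName);
  }
}

class SplitStepSketch {
  // Takes the parent offline if a server context exists; null means "no server".
  static void offlineParent(OnlineRegionsSketch or, String parent) {
    if (or != null) {
      or.removeFromOnlineRegions(parent);
    }
  }
}
```

A region-level test would call offlineParent(null, ...) and skip the bookkeeping, while a
server would pass itself.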

+ A src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java
New class that encapsulates everything to do with the split "transaction".

+ A src/main/java/org/apache/hadoop/hbase/util/PairOfSameType.java
Minor utility class
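
A utility of this kind might look like the sketch below, for example to hold the two
daughters of a split.  The class body is an assumption made for illustration; the class in
the patch may differ.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch of a pair whose two elements share one type.
class PairOfSameTypeSketch<T> implements Iterable<T> {
  private final T first;
  private final T second;

  PairOfSameTypeSketch(T first, T second) {
    this.first = first;
    this.second = second;
  }

  T getFirst() {
    return first;
  }

  T getSecond() {
    return second;
  }

  // Because both elements have the same type, the pair can be iterated: first, then second.
  @Override
  public Iterator<T> iterator() {
    return Arrays.asList(first, second).iterator();
  }
}
```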

+ M src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
(loadRegion) Added loading a region

+ M src/test/java/org/apache/hadoop/hbase/io/TestImmutableBytesWritable.java
(testSpecificCompare) Unrelated change

+ M src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java
Change because of new manner in which splits are run.  Added a splitRegions method.

+ A src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java
Test of region splitting code in region context.  Testing in server context would take
a bunch of work to make it possible to insert a mock instance of SplitTransaction.


This addresses bug HBASE-2461.
    http://issues.apache.org/jira/browse/HBASE-2461


Diffs
-----

  src/main/java/org/apache/hadoop/hbase/regionserver/CompactSplitThread.java 7589db3 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 6dc41a4 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 6a54736 
  src/main/java/org/apache/hadoop/hbase/regionserver/OnlineRegions.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/regionserver/SplitTransaction.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/util/PairOfSameType.java PRE-CREATION 
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 4d09fe9 
  src/test/java/org/apache/hadoop/hbase/io/TestImmutableBytesWritable.java 43fa6dd 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestHRegion.java 98bd3e5 
  src/test/java/org/apache/hadoop/hbase/regionserver/TestSplitTransaction.java PRE-CREATION


Diff: http://review.cloudera.org/r/474/diff


Testing
-------

Basic unit tests are passing.


Thanks,

stack




> Split doesn't handle IOExceptions when creating new region reference files
> --------------------------------------------------------------------------
>
>                 Key: HBASE-2461
>                 URL: https://issues.apache.org/jira/browse/HBASE-2461
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>            Reporter: Todd Lipcon
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 0.90.0
>
>         Attachments: 2461-v2.txt, 2461-v3.txt, 2461-v4.txt, 2461-v6.txt, 2461-v7.txt, 2461.txt
>
>
> I was testing an HDFS patch which had a bug in it, so it happened to throw an NPE during a split with the following trace:
> 2010-04-16 19:18:20,727 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region TestTable,-1945465867<1271449232310>,1271453785648
> java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.enqueueCurrentPacket(DFSClient.java:3124)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.flushInternal(DFSClient.java:3220)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.closeInternal(DFSClient.java:3306)
>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.close(DFSClient.java:3255)
>         at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:61)
>         at org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:86)
>         at org.apache.hadoop.fs.FileSystem.createNewFile(FileSystem.java:560)
>         at org.apache.hadoop.hbase.util.FSUtils.create(FSUtils.java:95)
>         at org.apache.hadoop.hbase.io.Reference.write(Reference.java:129)
>         at org.apache.hadoop.hbase.regionserver.StoreFile.split(StoreFile.java:498)
>         at org.apache.hadoop.hbase.regionserver.HRegion.splitRegion(HRegion.java:682)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.split(CompactSplitThread.java:162)
>         at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:95)
> After that, my region was gone; any further writes to it would fail.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

