hbase-dev mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1050) Allow regions to split around scanners
Date Sun, 14 Dec 2008 12:38:46 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656384#action_12656384 ]

Andrew Purtell commented on HBASE-1050:
---------------------------------------

I recently saw a failure case where this issue let enough data build up in a region that,
when the compaction/split finally happened, it pushed DFS over the edge and file errors
occurred. The failure appears to have happened after HBase considered the new mapfile for one
of the splits fully written, while DFS disagreed (as best I can tell, a block was corrupted and
lost before it could be replicated). Reality caught up later, when the region was reassigned
during a rebalance. The region was lost, and with it the test table, which had 978 regions at
the time. I was deliberately stressing HBase, and thus DFS, with 100 concurrent Heritrix
crawler threads.

I'll try to take up this issue upon my return from China if it is not otherwise being actively
worked on by then.

> Allow regions to split around scanners
> --------------------------------------
>
>                 Key: HBASE-1050
>                 URL: https://issues.apache.org/jira/browse/HBASE-1050
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: client, regionserver
>            Reporter: Andrew Purtell
>            Priority: Blocker
>             Fix For: 0.20.0
>
>
> We have a number of scanners iterating over a table that also sees constant write
> activity. If the scans are too frequent, splitting is suppressed. Then, at a lull, a
> number of splits happen all at once, occasionally overwhelming DFS and causing file corruption.
>
> I wonder how much work it would be to split regions around scanners: rather than waiting
> for scanner leases to expire, suspend/block the scanner, split the region, and then negotiate
> with the client to continue.
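
To make the "negotiate with the client to continue" idea concrete, here is a minimal toy sketch in plain Java. It is not HBase code: Region, Table, ResumableScanner and scanWithSplit are hypothetical names invented for illustration, and the technique shown (the client remembers the last row key it received and re-locates the owning region on every call) is only one assumed way the continuation could work, not what the issue prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Toy model only: Region, Table and ResumableScanner are hypothetical, not HBase classes.
public class SplitAroundScannerSketch {

    // A region holds the sorted rows in [startKey, endKey); endKey == null means "end of table".
    static final class Region {
        final String startKey;
        final String endKey;
        final NavigableMap<String, String> rows;

        Region(String startKey, String endKey, NavigableMap<String, String> rows) {
            this.startKey = startKey;
            this.endKey = endKey;
            this.rows = rows;
        }
    }

    // A table is a set of regions indexed by start key; it begins as one region starting at "".
    static final class Table {
        final NavigableMap<String, Region> regions = new TreeMap<>();

        Table(List<String> rowKeys) {
            NavigableMap<String, String> rows = new TreeMap<>();
            for (String k : rowKeys) rows.put(k, "value-" + k);
            regions.put("", new Region("", null, rows));
        }

        Region regionFor(String key) {
            return regions.floorEntry(key).getValue();
        }

        // Replace the region containing splitKey with two daughters that meet at splitKey.
        void split(String splitKey) {
            Region parent = regionFor(splitKey);
            if (splitKey.equals(parent.startKey)) return; // nothing to split off
            regions.put(parent.startKey, new Region(parent.startKey, splitKey,
                    new TreeMap<>(parent.rows.headMap(splitKey, false))));
            regions.put(splitKey, new Region(splitKey, parent.endKey,
                    new TreeMap<>(parent.rows.tailMap(splitKey, true))));
        }
    }

    // The scanner keeps only the last row key it returned, so it holds no region open
    // between calls and a split in between does not invalidate it.
    static final class ResumableScanner {
        private final Table table;
        private String lastRow; // null until the first row is returned

        ResumableScanner(Table table) { this.table = table; }

        // Returns the next row key in table order, or null at the end of the table.
        String next() {
            String probe = (lastRow == null) ? "" : lastRow;
            boolean inclusive = (lastRow == null);
            while (true) {
                Region region = table.regionFor(probe); // re-locate after any split
                Map.Entry<String, String> e = inclusive
                        ? region.rows.ceilingEntry(probe)
                        : region.rows.higherEntry(probe);
                if (e != null) {
                    lastRow = e.getKey();
                    return lastRow;
                }
                if (region.endKey == null) return null; // past the last region
                probe = region.endKey;                  // continue in the next region
                inclusive = true;
            }
        }
    }

    // Scan rowsBeforeSplit rows, split at splitKey, then finish the scan.
    static List<String> scanWithSplit(List<String> rowKeys, int rowsBeforeSplit, String splitKey) {
        Table table = new Table(rowKeys);
        ResumableScanner scanner = new ResumableScanner(table);
        List<String> seen = new ArrayList<>();
        String row;
        for (int i = 0; i < rowsBeforeSplit && (row = scanner.next()) != null; i++) seen.add(row);
        table.split(splitKey);
        while ((row = scanner.next()) != null) seen.add(row);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(scanWithSplit(Arrays.asList("a", "b", "c", "d", "e", "f"), 2, "d"));
    }
}
```

In this toy model the split never has to wait on the scanner, because the only scan state is a row key held by the client. A real implementation would also have to deal with leases, server-side scanner state and concurrent writes, none of which is modelled here.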

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

