hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-866) Blocking for ten minutes at a time
Date Thu, 04 Sep 2008 00:05:44 GMT

    [ https://issues.apache.org/jira/browse/HBASE-866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12628208#action_12628208
] 

stack commented on HBASE-866:
-----------------------------

Here is another example:

{code}
2008-09-03 23:35:54,899 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0134223832,1220484926984
2008-09-03 23:35:54,899 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache
flush for region TestTable,0134223832,1220484926984. Current region memcache size 64.0m
2008-09-03 23:35:54,899 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 10 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:35:54,901 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 10 on 60020'
2008-09-03 23:36:05,287 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 1 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,287 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 8 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,287 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 0 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,287 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 2 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,288 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 10 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,288 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 9 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,288 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 7 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,288 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 12 on 60020' on region TestTable,0134223832,1220484926984: Memcache
size 64.0m is >= than blocking 64.0m size
2008-09-03 23:36:05,650 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache
flush for region TestTable,0134223832,1220484926984 in 10751ms, sequence id=12850777, compaction
requested=false
2008-09-03 23:46:00,731 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache
flush for region TestTable,0134223832,1220484926984. Current region memcache size 64.0m
2008-09-03 23:46:00,731 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 9 on 60020'
2008-09-03 23:46:00,732 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 10 on 60020'
2008-09-03 23:46:00,732 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 12 on 60020'
2008-09-03 23:46:00,732 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 0 on 60020'
2008-09-03 23:46:00,741 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 8 on 60020'
2008-09-03 23:46:00,741 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 1 on 60020'
2008-09-03 23:46:00,741 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 7 on 60020'
2008-09-03 23:46:00,742 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates
for region TestTable,0134223832,1220484926984 'IPC Server handler 2 on 60020'
2008-09-03 23:46:04,123 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished memcache
flush for region TestTable,0134223832,1220484926984 in 3392ms, sequence id=12916127, compaction
requested=true
{code}

Running the above, I'd moved notification of blocking threads to just after clear of memcache
rather than later down in the flush but didn't seem to matter:
{code}
Index: src/java/org/apache/hadoop/hbase/regionserver/HRegion.java
===================================================================
--- src/java/org/apache/hadoop/hbase/regionserver/HRegion.java  (revision 691805)
+++ src/java/org/apache/hadoop/hbase/regionserver/HRegion.java  (working copy)
@@ -1042,6 +1042,10 @@
     } finally {
       this.updatesLock.writeLock().unlock();
     }
+    // Notify anyone waiting on memcache to clear: e.g. checkResources().
+    synchronized (this) {
+      notifyAll();
+    }
 
     // Any failure from here on out will be catastrophic requiring server
     // restart so hlog content can be replayed and put back into the memcache.
@@ -1081,13 +1085,6 @@
     //     log-sequence-ids can be safely ignored.
     this.log.completeCacheFlush(getRegionName(),
         regionInfo.getTableDesc().getName(), completeSequenceId);
-
-    // C. Finally notify anyone waiting on memcache to clear:
-    // e.g. checkResources().
-    synchronized (this) {
-      notifyAll();
-    }
-    
     if (LOG.isDebugEnabled()) {
       String timeTaken = StringUtils.formatTimeDiff(System.currentTimeMillis(), 
           startTime);
{code}


> Blocking for ten minutes at a time
> ----------------------------------
>
>                 Key: HBASE-866
>                 URL: https://issues.apache.org/jira/browse/HBASE-866
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>
> I've been testing running biggish MR jobs uploading into hbase.  My jobs consistently
fail with child task timing out its ten minute period.  Adding logging, was able to see that
we're actual stuck in a commit.  Following the thread of the row we're committing, I see this
in the log:
> {code}
> ...
> 2008-09-03 18:37:03,446 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Flush requested
on TestTable,0029377106,1220466998108
> 2008-09-03 18:37:03,446 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache
flush for region TestTable,0029377106,1220466998108. Current region memcache size 64.0m
> 2008-09-03 18:37:03,446 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 1 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:13,450 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 1 on 60020'
> 2008-09-03 18:37:16,089 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 16 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,090 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 1 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,090 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 4 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,090 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 6 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,090 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 2 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,090 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 12 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,090 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 9 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:16,091 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
for 'IPC Server handler 7 on 60020' on region TestTable,0029377106,1220466998108: Memcache
size 64.0m is >= than blocking 64.0m size
> 2008-09-03 18:37:21,984 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished
memcache flush for region TestTable,0029377106,1220466998108 in 18538ms, sequence id=2852547,
compaction requested=false
> 2008-09-03 18:47:06,241 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Started memcache
flush for region TestTable,0029377106,1220466998108. Current region memcache size 64.0m
> 2008-09-03 18:47:10,031 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Finished
memcache flush for region TestTable,0029377106,1220466998108 in 3790ms, sequence id=2919208,
compaction requested=true
> 2008-09-03 18:47:10,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 9 on 60020'
> 2008-09-03 18:47:10,031 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread:
Compaction requested for region: TestTable,0029377106,1220466998108
> 2008-09-03 18:47:10,031 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 12 on 60020'
> 2008-09-03 18:47:10,032 INFO org.apache.hadoop.hbase.regionserver.HRegion: starting compaction
on region TestTable,0029377106,1220466998108
> 2008-09-03 18:47:10,032 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 7 on 60020'
> 2008-09-03 18:47:10,035 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 6 on 60020'
> 2008-09-03 18:47:10,035 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 4 on 60020'
> 2008-09-03 18:47:10,035 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 2 on 60020'
> 2008-09-03 18:47:10,037 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 16 on 60020'
> 2008-09-03 18:47:10,043 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking
updates for region TestTable,0029377106,1220466998108 'IPC Server handler 1 on 60020'
> 2008-09-03 18:47:18,403 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction
completed on region TestTable,0029377106,1220466998108 in 8sec
> ...
> {code}
> Notice how we're blocked for ten minutes until new flush runs.  My guess is that the
flush that is going on concurrent with the blocking is clearing the flag 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message