Mailing-List: contact hbase-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hbase-dev@hadoop.apache.org
Message-ID: <1052731368.1216487371684.JavaMail.jira@brutus>
Date: Sat, 19 Jul 2008 10:09:31 -0700 (PDT)
From: "Andrew Purtell (JIRA)" <jira@apache.org>
To: hbase-dev@hadoop.apache.org
Subject: [jira] Commented: (HBASE-616) " We slept XXXXXX ms, ten times
 longer than scheduled: 3000" happens frequently.
In-Reply-To: <745419318.1210046515724.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12615023#action_12615023 ] 

Andrew Purtell commented on HBASE-616:
--------------------------------------

I think this issue is related to HBASE-15. DFS transaction timeouts and excessive sleeps are both symptoms of an overloaded system.
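
For readers unfamiliar with the warning, the check it implies can be sketched as follows. This is a minimal illustration, not the actual org.apache.hadoop.hbase.util.Sleeper source: the class name OversleepCheck, its method names, and the factor of 10 (taken only from the wording of the log message) are assumptions. The idea is that a thread asks to sleep for a fixed period, measures how long it was really away, and complains when the measured time is far larger than requested, which suggests the process was stalled rather than idle.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; NOT the real HBase Sleeper implementation.
final class OversleepCheck {
    // Assumed threshold, inferred from "ten times longer than scheduled".
    static final long FACTOR = 10;

    /** True when the measured sleep exceeded FACTOR times the scheduled period. */
    static boolean sleptTooLong(long sleptMs, long scheduledMs) {
        return sleptMs > FACTOR * scheduledMs;
    }

    /** Sleeps for scheduledMs and returns the milliseconds that actually elapsed. */
    static long timedSleep(long scheduledMs) throws InterruptedException {
        long start = System.nanoTime(); // monotonic clock, unaffected by clock resets
        Thread.sleep(scheduledMs);
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws InterruptedException {
        long scheduledMs = 50;
        long sleptMs = timedSleep(scheduledMs);
        // On a healthy machine this branch should rarely, if ever, be taken.
        if (sleptTooLong(sleptMs, scheduledMs)) {
            System.err.println("We slept " + sleptMs
                + "ms, ten times longer than scheduled: " + scheduledMs);
        }
    }
}
```

Under these assumptions, each of the reported values (for example 34557 ms against a 3000 ms period) would trip the check, since each is more than ten times its scheduled period.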

> " We slept XXXXXX ms, ten times longer than scheduled: 3000" happens frequently.
> --------------------------------------------------------------------------------
>
>                 Key: HBASE-616
>                 URL: https://issues.apache.org/jira/browse/HBASE-616
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>
> Just saw the below in a log... all in a row on the one server.
> {code}
> 2008-05-05 18:08:17,512 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 34557ms, ten times longer than scheduled: 3000
> 2008-05-05 18:11:08,879 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 30576ms, ten times longer than scheduled: 3000
> 2008-05-05 18:30:45,056 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1091720ms, ten times longer than scheduled: 3000
> 2008-05-05 18:30:45,056 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 1094209ms, ten times longer than scheduled: 10000
> 2008-05-05 18:30:45,429 FATAL org.apache.hadoop.hbase.HRegionServer: unable to report to master for 1092093 milliseconds - aborting server
> {code}
> We're seeing these kinds of outages pretty frequently.  In the case above, it was a small cluster that was using TableReduce to insert.  MapReduce, HDFS, and HBase were all running on the same nodes.
