Date: Mon, 7 Mar 2016 03:23:40 +0000 (UTC)
From: "Phil Yang (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-15408) MiniCluster's master crash on initialization and unittest timeout

     [ https://issues.apache.org/jira/browse/HBASE-15408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phil Yang updated HBASE-15408:
------------------------------
    Attachment: fail.txt

Uploaded the log. The important part is:

2016-03-04 19:12:24,321 FATAL [10.235.114.28:58993.activeMasterManager] master.HMaster$1(1741): Failed to become active master
org.apache.hadoop.hbase.DoNotRetryIOException: Failed after retry of OutOfOrderScannerNextException: was there a rpc timeout?
	at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:469)
	at org.apache.hadoop.hbase.client.ClientScanner.nextWithSyncCache(ClientScanner.java:358)
	at org.apache.hadoop.hbase.client.ClientSimpleScanner.next(ClientSimpleScanner.java:51)
	at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:776)
	at org.apache.hadoop.hbase.MetaTableAccessor.scanMeta(MetaTableAccessor.java:702)
	at org.apache.hadoop.hbase.MetaTableAccessor.fullScan(MetaTableAccessor.java:238)
	at org.apache.hadoop.hbase.MetaTableAccessor.fullScanRegions(MetaTableAccessor.java:213)
	at org.apache.hadoop.hbase.master.AssignmentManager.rebuildUserRegions(AssignmentManager.java:1677)
	at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:417)
	at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:767)
	at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:190)
	at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1737)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hbase.exceptions.OutOfOrderScannerNextException: Expected nextCallSeq: 1 But the nextCallSeq got from client: 0; request=scanner_id: 5 number_of_rows: 100 close_scanner: false next_call_seq: 0 client_handles_partials: true client_handles_heartbeats: true track_scan_metrics: false renew: false
	at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2613)
	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:220)
	at org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:63)
	at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithoutRetries(RpcRetryingCallerImpl.java:185)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
	at org.apache.hadoop.hbase.client.RpcRetryingCallerImpl.callWithRetries(RpcRetryingCallerImpl.java:118)
	at org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	... 1 more
2016-03-04 19:12:24,327 FATAL [10.235.114.28:58993.activeMasterManager] master.HMaster(2208): Master server abort: loaded coprocessors are: [org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint]

> MiniCluster's master crash on initialization and unittest timeout
> -----------------------------------------------------------------
>
>                 Key: HBASE-15408
>                 URL: https://issues.apache.org/jira/browse/HBASE-15408
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Phil Yang
>         Attachments: fail.txt
>
>
> Recently many tests have been timing out on build.apache.org. I have no logs from the timed-out tests, but I found a possible reason: the master crashes on initialization, and the minicluster then reports "No master found; retry". The crash is caused by OutOfOrderScannerNextException.
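
For readers unfamiliar with the "Caused by" line in the log: the region server expected nextCallSeq 1 for scanner 5, but the client sent 0. As the exception message itself hints, one way this can happen is an RPC timeout, where the server processes a scan call but the client never sees the reply and sends the same sequence number again. The following is a minimal sketch of that kind of sequence check, not HBase's actual implementation; the class names ScannerSeqSketch, Server and OutOfOrderException are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class ScannerSeqSketch {
    /** Hypothetical stand-in for the server-side exception; not the HBase class. */
    static class OutOfOrderException extends RuntimeException {
        OutOfOrderException(String msg) {
            super(msg);
        }
    }

    /** Hypothetical server that tracks the next expected sequence number per scanner id. */
    static class Server {
        private final Map<Long, Long> expectedSeq = new HashMap<>();

        void openScanner(long scannerId) {
            expectedSeq.put(scannerId, 0L);
        }

        /** Serves one batch if the client's sequence number matches, then advances it. */
        String next(long scannerId, long nextCallSeq) {
            long expected = expectedSeq.get(scannerId);
            if (nextCallSeq != expected) {
                throw new OutOfOrderException("Expected nextCallSeq: " + expected
                        + " But the nextCallSeq got from client: " + nextCallSeq);
            }
            expectedSeq.put(scannerId, expected + 1);
            return "batch-" + expected;
        }
    }

    public static void main(String[] args) {
        Server server = new Server();
        server.openScanner(5L);

        // The server handles the first call, but assume the reply is lost (timeout),
        // so the client still believes its next sequence number is 0.
        server.next(5L, 0L);

        try {
            // The client repeats the call with the stale sequence number.
            server.next(5L, 0L);
        } catch (OutOfOrderException e) {
            // prints "Expected nextCallSeq: 1 But the nextCallSeq got from client: 0"
            System.out.println(e.getMessage());
        }
    }
}
```

The check makes a repeated call detectable: without it, the server would silently hand out the second batch and the client would skip the first one.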