Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 83923200D37 for ; Thu, 9 Nov 2017 20:08:08 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 82133160BEF; Thu, 9 Nov 2017 19:08:08 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id A29F21609C8 for ; Thu, 9 Nov 2017 20:08:07 +0100 (CET) Received: (qmail 71844 invoked by uid 500); 9 Nov 2017 19:08:06 -0000 Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list issues@hbase.apache.org Received: (qmail 71833 invoked by uid 99); 9 Nov 2017 19:08:06 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 09 Nov 2017 19:08:06 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id F026E1A3E73 for ; Thu, 9 Nov 2017 19:08:05 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -100.002 X-Spam-Level: X-Spam-Status: No, score=-100.002 tagged_above=-999 required=6.31 tests=[RP_MATCHES_RCVD=-0.001, SPF_PASS=-0.001, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-us.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id Jq8daR7zqWvd for ; Thu, 9 Nov 2017 19:08:04 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-us.apache.org (ASF Mail Server at mx1-lw-us.apache.org) with ESMTP id 4CB37617F3 for ; Thu, 9 Nov 2017 19:08:03 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 59AA0E25A9 for ; Thu, 9 Nov 2017 19:08:02 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 6BC3B2475E for ; Thu, 9 Nov 2017 19:08:01 +0000 (UTC) Date: Thu, 9 Nov 2017 19:08:01 +0000 (UTC) From: "Andrew Purtell (JIRA)" To: issues@hbase.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (HBASE-17475) Stack overflow in AsyncProcess if retry too much MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Thu, 09 Nov 2017 19:08:08 -0000 [ https://issues.apache.org/jira/browse/HBASE-17475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-17475: ----------------------------------- Fix Version/s: 1.4.0 > Stack overflow in AsyncProcess if retry too much > ------------------------------------------------ > > Key: HBASE-17475 > URL: https://issues.apache.org/jira/browse/HBASE-17475 > Project: HBase > Issue Type: Bug > Components: API, Client > Affects Versions: 2.0.0, 1.4.0 > Reporter: Allan Yang > Assignee: Allan Yang > Labels: trivial > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-17475-branch-1.patch, HBASE-17475-branch-1.v2.patch, HBASE-17475-branch-1.v3.patch, HBASE-17475.patch, HBASE-17475.v2.patch > > > In AsyncProcess, we resubmit the retry task in the same thread > {code} > // run all the runnables > for (Runnable runnable : runnables) { > if ((--actionsRemaining == 0) && reuseThread) { > runnable.run(); > } else { > try { > pool.submit(runnable); > } > ...... > {code} > But, if we retry too much time. soon, stack overflow will occur. This is very common in clusters with Phoenix. Phoenix need to write index table in the normal write path, retry will cause stack overflow exception. > {noformat} > "htable-pool19-t2" #582 daemon prio=5 os_prio=0 tid=0x0000000002687800 nid=0x4a96 waiting on condition [0x00007fe3f6301000] > java.lang.Thread.State: TIMED_WAITING (sleeping) > at java.lang.Thread.sleep(Native Method) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1174) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.sendMultiAction(AsyncProcess.java:977) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.groupAndSendMultiAction(AsyncProcess.java:886) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.resubmit(AsyncProcess.java:1181) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.receiveMultiAction(AsyncProcess.java:1321) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl.access$1200(AsyncProcess.java:575) > at org.apache.hadoop.hbase.client.AsyncProcess$AsyncRequestFutureImpl$SingleServerRequestRunnable.run(AsyncProcess.java:729) > > Too long to present... > {noformat} > We need to control the {{reuseThread}}, if number of retry is too much, then don't reuse the same thread to prevent stack overflow. -- This message was sent by Atlassian JIRA (v6.4.14#64029)