Date: Mon, 13 Jul 2015 22:35:05 +0000 (UTC)
From: "Hudson (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-13997) ScannerCallableWithReplicas cause Infinitely blocking


    [ https://issues.apache.org/jira/browse/HBASE-13997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14625504#comment-14625504 ]

Hudson commented on HBASE-13997:
--------------------------------

SUCCESS: Integrated in HBase-1.3 #52 (See [https://builds.apache.org/job/HBase-1.3/52/])
HBASE-13997 ScannerCallableWithReplicas cause Infinitely blocking (Zephyr Guo and Enis) (enis: rev 426bd097775dc9ed18b4f208429eeece0b472e95)
* hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientScanner.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/client/ScannerCallableWithReplicas.java


> ScannerCallableWithReplicas cause Infinitely blocking
> -----------------------------------------------------
>
>                 Key: HBASE-13997
>                 URL: https://issues.apache.org/jira/browse/HBASE-13997
>             Project: HBase
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 1.0.1.1
>            Reporter: Zephyr Guo
>            Assignee: Zephyr Guo
>            Priority: Minor
>         Attachments: HBASE-13997.patch, hbase-13997_v2.patch
>
>
> There is a bug in the ScannerCallableWithReplicas.addCallsForOtherReplicas method:
> {code:title=code in ScannerCallableWithReplicas.addCallsForOtherReplicas |borderStyle=solid}
> private int addCallsForOtherReplicas(
>     BoundedCompletionService<Pair<Result[], ScannerCallable>> cs, RegionLocations rl, int min,
>     int max) {
>   if (scan.getConsistency() == Consistency.STRONG) {
>     return 0; // not scheduling on other replicas for strong consistency
>   }
>   for (int id = min; id <= max; id++) {
>     if (currentScannerCallable.getHRegionInfo().getReplicaId() == id) {
>       continue; // this was already scheduled earlier
>     }
>     ScannerCallable s = currentScannerCallable.getScannerCallableForReplica(id);
>     if (this.lastResult != null) {
>       s.getScan().setStartRow(this.lastResult.getRow());
>     }
>     outstandingCallables.add(s);
>     RetryingRPC retryingOnReplica = new RetryingRPC(s);
>     cs.submit(retryingOnReplica);
>   }
>   return max - min + 1; // bug? should be "max - min", because the "continue"
>                         // branch always happens once
> }
> {code}
> It can leave completed < submitted permanently, so the following code blocks forever.
> {code:title=code in ScannerCallableWithReplicas.call|borderStyle=solid}
> // submitted is larger than the number of calls actually submitted
> submitted += addCallsForOtherReplicas(cs, rl, 0, rl.size() - 1);
> try {
>   // this loop is affected
>   while (completed < submitted) {
>     try {
>       Future<Pair<Result[], ScannerCallable>> f = cs.take();
>       Pair<Result[], ScannerCallable> r = f.get();
>       if (r != null && r.getSecond() != null) {
>         updateCurrentlyServingReplica(r.getSecond(), r.getFirst(), done, pool);
>       }
>       return r == null ?
>           null : r.getFirst(); // great we got an answer
>     } catch (ExecutionException e) {
>       // if not cancel or interrupt, wait until all RPC's are done
>       // one of the tasks failed. Save the exception for later.
>       if (exceptions == null) exceptions = new ArrayList<ExecutionException>(rl.size());
>       exceptions.add(e);
>       completed++;
>     }
>   }
> } catch (CancellationException e) {
>   throw new InterruptedIOException(e.getMessage());
> } catch (InterruptedException e) {
>   throw new InterruptedIOException(e.getMessage());
> } finally {
>   // We get here because we were interrupted or because one or more of the
>   // calls succeeded or failed. In all cases, we stop all our tasks.
>   cs.cancelAll(true);
> }
> {code}
> If every replica region server fails with an ExecutionException, the call blocks forever in cs.take().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
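
The counting mismatch described in the issue can be reproduced outside HBase. The following is a minimal sketch, not the HBase client code: ReplicaCountSketch, submitToOtherReplicas and wouldBlock are hypothetical names, java.util.concurrent.ExecutorCompletionService stands in for the client's completion service, and every simulated replica call is assumed to fail. It uses poll() with a timeout where the quoted loop uses take(), so that the sketch terminates and can report that the loop would have waited forever.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the HBase client classes; not the real API.
class ReplicaCountSketch {

    // Submits one always-failing task per replica id in [min, max] except
    // currentId, and returns how many tasks were really submitted.
    static int submitToOtherReplicas(ExecutorCompletionService<String> cs,
                                     int currentId, int min, int max) {
        int scheduled = 0;
        for (int id = min; id <= max; id++) {
            if (id == currentId) {
                continue; // already scheduled earlier
            }
            final int replica = id;
            cs.submit(() -> {
                throw new IllegalStateException("replica " + replica + " failed");
            });
            scheduled++;
        }
        return scheduled;
    }

    // Runs the wait loop from the issue with every replica failing.
    // Returns true if the loop would have blocked in take().
    static boolean wouldBlock(int replicas, boolean useReportedCount) {
        ExecutorService pool = Executors.newFixedThreadPool(replicas);
        try {
            ExecutorCompletionService<String> cs = new ExecutorCompletionService<>(pool);
            int min = 0, max = replicas - 1, currentId = 0;

            // The current replica is scheduled first, as in the original call().
            cs.submit(() -> {
                throw new IllegalStateException("replica 0 failed");
            });
            int submitted = 1;

            int scheduled = submitToOtherReplicas(cs, currentId, min, max);
            // "max - min + 1" is the value the quoted method reports.
            submitted += useReportedCount ? (max - min + 1) : scheduled;

            int completed = 0;
            while (completed < submitted) {
                Future<String> f = cs.poll(500, TimeUnit.MILLISECONDS);
                if (f == null) {
                    return true; // nothing left to complete: take() would wait forever
                }
                try {
                    f.get();
                    return false; // a replica answered
                } catch (ExecutionException e) {
                    completed++; // one of the tasks failed
                }
            }
            return false; // all failures were counted, the loop ended normally
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("reported count blocks: " + wouldBlock(3, true));
        System.out.println("actual count blocks:   " + wouldBlock(3, false));
    }
}
```

With three simulated replicas, the reported count makes submitted equal to 4 while only 3 tasks exist, so the loop is still waiting after the third failure. Counting the tasks actually scheduled lets the loop finish once every failure has been recorded.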