Date: Wed, 29 Mar 2017 18:09:41 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: giraph-dev@incubator.apache.org
Reply-To: dev@giraph.apache.org
Subject: [jira] [Commented] (GIRAPH-1137) Remove channel probing from Netty worker thread for credit-based flow-control

    [ https://issues.apache.org/jira/browse/GIRAPH-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15947619#comment-15947619 ]

ASF GitHub Bot commented on GIRAPH-1137:
----------------------------------------

Github user dlogothetis commented on the issue:

    https://github.com/apache/giraph/pull/26

    @heslami, findbugs reports this:

    [INFO]
    [INFO] Synchronization performed on java.util.concurrent.atomic.AtomicInteger in org.apache.giraph.comm.flow_control.CreditBasedFlowControl$2.run() ["org.apache.giraph.comm.flow_control.CreditBasedFlowControl$2"] At CreditBasedFlowControl.java:[lines 237-256]


> Remove channel probing from Netty worker thread for credit-based flow-control
> -----------------------------------------------------------------------------
>
>                 Key: GIRAPH-1137
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1137
>             Project: Giraph
>          Issue Type: Bug
>            Reporter: Hassan Eslami
>            Assignee: Hassan Eslami
>
> In credit-based flow control, client threads (one type of Netty worker thread used in Giraph) sometimes try to send requests to other workers. This is bad practice in Netty and can cause Netty to flag the execution as deadlock-prone (an example exception is shown below). Client threads should only be responsible for sending ACK/NACK messages in response to requests, and they should do so by reusing the channel on which they received the request. In the current implementation, client threads may try to send unsent/cached requests in credit-based flow control. Sending such requests should be delegated to other threads.
>
> WARN 2017-03-08 06:06:22,104 [netty-client-worker-3] ....
> io.netty.util.concurrent.BlockingOperationException: DefaultChannelPromise@2c455378(incomplete)
> 	at io.netty.util.concurrent.DefaultPromise.checkDeadLock(DefaultPromise.java:383)
> 	at io.netty.channel.DefaultChannelPromise.checkDeadLock(DefaultChannelPromise.java:157)
> 	at io.netty.util.concurrent.DefaultPromise.await0(DefaultPromise.java:343)
> 	at io.netty.util.concurrent.DefaultPromise.await(DefaultPromise.java:259)
> 	at org.apache.giraph.utils.ProgressableUtils$ChannelFutureWaitable.waitFor(ProgressableUtils.java:461)
> 	at org.apache.giraph.utils.ProgressableUtils.waitFor(ProgressableUtils.java:214)
> 	at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:180)
> 	at org.apache.giraph.utils.ProgressableUtils.waitForever(ProgressableUtils.java:165)
> 	at org.apache.giraph.utils.ProgressableUtils.awaitChannelFuture(ProgressableUtils.java:132)
> 	at org.apache.giraph.comm.netty.NettyClient.getNextChannel(NettyClient.java:715)
> 	at org.apache.giraph.comm.netty.NettyClient.writeRequestToChannel(NettyClient.java:799)
> 	at org.apache.giraph.comm.netty.NettyClient.doSend(NettyClient.java:789)
> 	at org.apache.giraph.comm.flow_control.CreditBasedFlowControl.trySendCachedRequests(CreditBasedFlowControl.java:515)
> 	at org.apache.giraph.comm.flow_control.CreditBasedFlowControl.messageAckReceived(CreditBasedFlowControl.java:485)
> 	at org.apache.giraph.comm.netty.NettyClient.messageReceived(NettyClient.java:840)
> 	at org.apache.giraph.comm.netty.handler.ResponseClientHandler.channelRead(ResponseClientHandler.java:87)
> 	at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
> 	at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
> 	at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:153)
> 	at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
> 	at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
> 	at org.apache.giraph.comm.netty.InboundByteCounter.channelRead(InboundByteCounter.java:89)
> 	at io.netty.channel.DefaultChannelHandlerContext.invokeChannelRead(DefaultChannelHandlerContext.java:338)
> 	at io.netty.channel.DefaultChannelHandlerContext.fireChannelRead(DefaultChannelHandlerContext.java:324)
> 	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:785)
> 	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:126)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:485)
> 	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:452)
> 	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:346)
> 	at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:101)
> 	at java.lang.Thread.run(Thread.java:745)
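
For readers following the issue, the delegation it asks for (the thread that receives an ACK only hands work off, and a separate thread performs the potentially blocking send) might look roughly like the sketch below. This is not the change made in the pull request. It uses only the JDK, not Netty, and every name in it (ResendHandOffSketch, onAckReceived, blockingSend, "resend-thread", "io-thread") is hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical sketch: none of these names come from Giraph or Netty.
 * The thread that receives an ACK never sends a cached request itself;
 * it only schedules the send on a dedicated thread.
 */
public class ResendHandOffSketch {
  /** Dedicated single thread that is allowed to block while sending. */
  private final ExecutorService resendExecutor =
      Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "resend-thread");
        t.setDaemon(true);
        return t;
      });

  /** Records which thread performed each simulated send. */
  private final List<String> sendLog = new CopyOnWriteArrayList<>();

  /**
   * Stand-in for the callback that runs on the I/O (event loop) thread when
   * an ACK frees up credit. It returns immediately after scheduling.
   */
  public void onAckReceived(int requestId) {
    resendExecutor.execute(() -> blockingSend(requestId));
  }

  /**
   * Stand-in for a send that may wait on a channel. Waiting here is safe
   * because this method never runs on the I/O thread.
   */
  private void blockingSend(int requestId) {
    sendLog.add(Thread.currentThread().getName() + ":" + requestId);
  }

  /** Stops accepting work, waits for pending sends, and returns the log. */
  public List<String> shutdownAndGetLog() throws InterruptedException {
    resendExecutor.shutdown();
    if (!resendExecutor.awaitTermination(5, TimeUnit.SECONDS)) {
      throw new IllegalStateException("pending sends did not finish");
    }
    return new ArrayList<>(sendLog);
  }

  public static void main(String[] args) throws Exception {
    ResendHandOffSketch sketch = new ResendHandOffSketch();
    // Simulated I/O thread: it only hands off, it never sends.
    Thread ioThread = new Thread(() -> {
      sketch.onAckReceived(1);
      sketch.onAckReceived(2);
    }, "io-thread");
    ioThread.start();
    ioThread.join();
    System.out.println(sketch.shutdownAndGetLog());
  }
}
```

Every entry in the log carries the name of the dedicated thread, never "io-thread", which is the property the issue description asks for. A single-threaded executor is used here only to keep the sends in submission order; whether ordering matters in the real code is not stated in the issue.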