nifi-issues mailing list archives

From "Joe Witt (Jira)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-6736) If not given enough threads, Load Balanced Connections may block for long periods of time without making progress
Date Wed, 02 Oct 2019 01:07:00 GMT

    [ https://issues.apache.org/jira/browse/NIFI-6736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942411#comment-16942411 ]

Joe Witt commented on NIFI-6736:
--------------------------------

this looks important to pull into 1.10.0.  Will eyeball and help review/merge if nobody else
takes it first

> If not given enough threads, Load Balanced Connections may block for long periods of time without making progress
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-6736
>                 URL: https://issues.apache.org/jira/browse/NIFI-6736
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Critical
>             Fix For: 1.10.0
>
>
> When load-balanced connections are used, we have a few different properties that we can
> configure. Specifically, the properties with their default values are:
> nifi.cluster.load.balance.connections.per.node=4
> nifi.cluster.load.balance.max.thread.count=8
> nifi.cluster.load.balance.comms.timeout=30 sec
> If the max thread count is below (connections per node * number of nodes in the cluster),
> everything still works well as long as data volumes are reasonably high across all
> load-balanced connections. However, if one of the connections has a low data volume, the
> load-balanced connections can stop pushing data for some period of time, typically some
> multiple of the "comms.timeout" property.
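
The condition above comes down to simple arithmetic. The following is a minimal sketch of that check, not NiFi code: the class and method names are hypothetical, and the 3-node cluster size is an assumption chosen for illustration. Only the default values (4 connections per node, 8 max threads) come from the issue description.

```java
// Hypothetical helper, not part of NiFi: checks the condition described in the issue.
class LoadBalanceThreadCheck {

    // Threads needed so that every load-balance connection can have its own thread,
    // using the formula from the issue: connections per node * number of nodes.
    static int requiredThreads(int connectionsPerNode, int nodeCount) {
        return connectionsPerNode * nodeCount;
    }

    // True when the configured max thread count is below the required number of threads.
    static boolean isUnderProvisioned(int maxThreadCount, int connectionsPerNode, int nodeCount) {
        return maxThreadCount < requiredThreads(connectionsPerNode, nodeCount);
    }

    public static void main(String[] args) {
        int connectionsPerNode = 4; // default of nifi.cluster.load.balance.connections.per.node
        int maxThreadCount = 8;     // default of nifi.cluster.load.balance.max.thread.count
        int nodeCount = 3;          // assumed cluster size, for illustration only

        System.out.println("required=" + requiredThreads(connectionsPerNode, nodeCount)
                + ", configured=" + maxThreadCount
                + ", underProvisioned=" + isUnderProvisioned(maxThreadCount, connectionsPerNode, nodeCount));
    }
}
```

With the defaults, an assumed 3-node cluster would need 12 threads but has only 8 configured, so it falls into the case the issue describes.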
> This appears to be because the server uses blocking socket I/O rather than NIO: once data
> has been received, it checks whether more data is available. If it does not receive any
> indication for some period of time, the read times out. Only then does the server add the
> socket connection back to the pool of connections to read from. This means that the thread
> can be stuck waiting for more data, blocking any progress from other connections on that thread.
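
The blocking behavior described above can be reproduced in isolation with plain `java.net` sockets. This is a rough sketch, not NiFi's actual server code: the class and method names are hypothetical, and a 200 ms read timeout stands in for the 30 sec "comms.timeout" default. A client sends one byte and then goes quiet; the reading thread gets that byte, asks for more, and is held until the socket timeout fires.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class, not part of NiFi.
class BlockingReadDemo {

    // Returns how many milliseconds the calling thread was stuck waiting
    // for more data on a connection whose peer had nothing left to send.
    static long blockedMillis(int timeoutMillis) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {

            // Blocking reads on this socket give up after timeoutMillis.
            accepted.setSoTimeout(timeoutMillis);

            // The "low volume" peer sends a single byte, then stays connected but idle.
            client.getOutputStream().write(42);
            client.getOutputStream().flush();

            InputStream in = accepted.getInputStream();
            if (in.read() != 42) {
                throw new IOException("unexpected first byte");
            }

            long start = System.nanoTime();
            try {
                // Check for more data: with blocking I/O this holds the thread.
                in.read();
            } catch (SocketTimeoutException expected) {
                // Only now could the connection be handed back to a pool.
            }
            return (System.nanoTime() - start) / 1_000_000;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("Thread was blocked for about " + blockedMillis(200) + " ms");
    }
}
```

While the second `read()` is pending, the thread cannot serve any other connection, which is the stall the issue attributes to low-volume connections.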



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
