nifi-issues mailing list archives

From "Dmitry Ibragimov (Jira)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-7437) UI is slow when nifi.analytics.predict.enabled is true
Date Thu, 14 May 2020 09:24:00 GMT

    [ https://issues.apache.org/jira/browse/NIFI-7437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107122#comment-17107122 ]

Dmitry Ibragimov commented on NIFI-7437:
----------------------------------------

[~YolandaMDavis] Our current heap settings are as follows:
{code:java}
# JVM memory settings
java.arg.2=-Xms12g
java.arg.3=-Xmx12g
java.arg.13=-XX:+UseG1GC
{code}
{code:java}
openjdk version "11.0.7" 2020-04-14 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.7+10-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.7+10-LTS, mixed mode, sharing){code}
We use a clustered, secured setup with 3 nodes (16 cores, 32 GB memory each).

However, I've also reproduced this issue on a single unsecured node (and on Java 8 with the
older GC as well) by copying our large flow.xml.gz (more than 4000 processors, most of them
in disabled state) to that node and enabling the nifi.analytics.predict.enabled property.

Reproduction of this issue is heap-independent: you just need to create a large number of
disabled processors and connect them together.
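
The timing in the description adds up: each prediction takes only a few milliseconds, but one
set of predictions is computed per connection, so the cost is multiplied by thousands. A rough
back-of-the-envelope check, using approximate numbers taken from the logs in this ticket
(4842 connection ids, roughly 4 ms of prediction work each):
{code:java}
// Rough estimate only: ~4842 connections at ~4 ms each is on the order of
// the 19-20 s response times that ThreadPoolRequestReplicator logged for
// GET /nifi-api/flow/process-groups/root.
public class PredictionCostEstimate {
    public static void main(String[] args) {
        int connectionCount = 4842;       // from the zgrep count in this ticket
        double millisPerConnection = 4.0; // approximate, from the DEBUG timestamps
        double totalSeconds = connectionCount * millisPerConnection / 1000.0;
        System.out.printf("estimated total: %.1f s%n", totalSeconds); // ~19.4 s
    }
}
{code}
That estimate lands squarely in the 19548-20625 ms range reported per node, which is consistent
with the per-connection prediction calls being the dominant cost of the request.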

> UI is slow when nifi.analytics.predict.enabled is true
> ------------------------------------------------------
>
>                 Key: NIFI-7437
>                 URL: https://issues.apache.org/jira/browse/NIFI-7437
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core UI, Extensions
>    Affects Versions: 1.10.0, 1.11.4
>         Environment: Java11, CentOS8
>            Reporter: Dmitry Ibragimov
>            Assignee: Yolanda M. Davis
>            Priority: Critical
>              Labels: features, performance
>
> We ran into an issue with nifi.analytics.predict.enabled set to true after upgrading the
cluster to 1.11.4.
> We have about 4000 processors in our development environment, but most of them are not
running: 256 running, 1263 stopped, 2543 disabled.
> After upgrading from 1.9.2 to 1.11.4 we decided to test the back-pressure prediction
feature and enabled it in the configuration:
> {code:java}
> nifi.analytics.predict.enabled=true
> nifi.analytics.predict.interval=3 mins
> nifi.analytics.query.interval=5 mins
> nifi.analytics.connection.model.implementation=org.apache.nifi.controller.status.analytics.models.OrdinaryLeastSquares
> nifi.analytics.connection.model.score.name=rSquared
> nifi.analytics.connection.model.score.threshold=.90
> {code}
> We then hit a severe UI performance degradation: the root page opens in 20 seconds
instead of 200-500 ms, roughly 100 times slower. I've tested it in different environments (CentOS 7/8,
Java 8/11, clustered secured, clustered unsecured, standalone unsecured) - all the same.
> In debug log for ThreadPoolRequestReplicator:
> {code:java}
> 2020-05-09 08:03:34,459 DEBUG [Replicate Request Thread-2] o.a.n.c.c.h.r.ThreadPoolRequestReplicator
For GET /nifi-api/flow/process-groups/root (Request ID c144196f-d4cb-4053-8828-70e06f7c5100),
minimum response time = 19548, max = 20625, average = 20161.0 ms
> 2020-05-09 08:03:34,459 DEBUG [Replicate Request Thread-2] o.a.n.c.c.h.r.ThreadPoolRequestReplicator
Node Responses for GET /nifi-api/flow/process-groups/root (Request ID c144196f-d4cb-4053-8828-70e06f7c5100):
> newnifi01:8080: 19548 millis
> newnifi02:8080: 20625 millis
> newnifi03:8080: 20310 millis{code}
> Deeper debug output:
>  
> {code:java}
> 2020-05-09 10:31:13,252 DEBUG [NiFi Web Server-21] org.eclipse.jetty.server.HttpChannel
REQUEST for //newnifi01:8080/nifi-api/flow/process-groups/root on HttpChannelOverHttp@68d3e945{r=1,c=false,c=false/false,a=IDLE,uri=//newnifi01:8080/nifi-api/flow/process-groups/root,age=0}
> GET //newnifi01:8080/nifi-api/flow/process-groups/root HTTP/1.1
> Host: newnifi01:8080
> ...
> 2020-05-09 10:31:13,256 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for calculating time back pressure by content size in bytes. Returning
-1
> 2020-05-09 10:31:13,257 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for calculating time to back pressure by object count. Returning -1
> 2020-05-09 10:31:13,257 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting content size in bytes for next interval. Returning -1
> 2020-05-09 10:31:13,257 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting object count for next interval. Returning -1
> 2020-05-09 10:31:13,258 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting object count for next interval. Returning -1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting content size in bytes for next interval. Returning -1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: nextIntervalPercentageUseCount=-1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: nextIntervalBytes=-1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: timeToBytesBackpressureMillis=-1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: nextIntervalCount=-1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: nextIntervalPercentageUseBytes=-1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: intervalTimeMillis=180000
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id eb602b2a-016f-1000-0000-00002767192a: timeToCountBackpressureMillis=-1
> 2020-05-09 10:31:13,259 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.CachingConnectionStatusAnalyticsEngine
Pulled existing analytics from cache for connection id: ec014ca8-a82b-10bb-0000-00004069f95e
> 2020-05-09 10:31:13,260 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for calculating time back pressure by content size in bytes. Returning
-1
> 2020-05-09 10:31:13,261 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for calculating time to back pressure by object count. Returning -1
> 2020-05-09 10:31:13,261 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting content size in bytes for next interval. Returning -1
> 2020-05-09 10:31:13,261 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting object count for next interval. Returning -1
> 2020-05-09 10:31:13,262 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting object count for next interval. Returning -1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for predicting content size in bytes for next interval. Returning -1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: nextIntervalPercentageUseCount=-1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: nextIntervalBytes=-1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: timeToBytesBackpressureMillis=-1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: nextIntervalCount=-1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: nextIntervalPercentageUseBytes=-1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: intervalTimeMillis=180000
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Prediction model for connection id ec014ca8-a82b-10bb-0000-00004069f95e: timeToCountBackpressureMillis=-1
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.CachingConnectionStatusAnalyticsEngine
Pulled existing analytics from cache for connection id: eb61f002-016f-1000-0000-000067a860ea
> 2020-05-09 10:31:13,263 DEBUG [NiFi Web Server-21] o.a.n.c.s.a.ConnectionStatusAnalytics
Model is not valid for calculating time back pressure by content size in bytes. Returning
-1
> ...
> many more messages of the same kind
> ...
> 2020-05-09 10:31:32,758 DEBUG [NiFi Web Server-21] org.eclipse.jetty.server.HttpChannel
COMMIT for /nifi-api/flow/process-groups/root on HttpChannelOverHttp@68d3e945{r=1,c=true,c=false/false,a=DISPATCHED,uri=//newnifi01:8080/nifi-api/flow/process-groups/root,age=19506}
> 200 OK HTTP/1.1
> Date: Sat, 09 May 2020 07:31:13 GMT
> {code}
> A 3 ms prediction is quite fast on its own, but if you count these messages, there are about as
many of them as there are processors:
> {code:java}
> #zgrep "2020-05-09 10:31:" /var/log/nifi/nifi-app_2020-05-09_10.1.log.gz | grep "Prediction model for connection id" | cut -d' ' -f 13 | sort | uniq -c | wc -l
> 4842{code}
> I've checked several random connection ids - they are connections from and to disabled processors.
>  Maybe back-pressure prediction should be skipped for disabled connections? Any ideas how
we can fix this without disabling prediction entirely?
>  
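
One possible shape for the fix suggested in the description - skipping prediction for
connections whose endpoints are disabled - sketched with simplified stand-in types. These
types and names are hypothetical illustrations, not NiFi's actual Connection/ScheduledState API:
{code:java}
// Hypothetical sketch: skip analytics for connections between disabled
// components instead of computing (and logging) a -1 prediction for each.
// All types here are simplified stand-ins for illustration only.
public class SkipDisabledSketch {
    enum ScheduledState { RUNNING, STOPPED, DISABLED }

    static class Connection {
        final ScheduledState source;
        final ScheduledState destination;
        Connection(ScheduledState source, ScheduledState destination) {
            this.source = source;
            this.destination = destination;
        }
    }

    /** Prediction is worth computing only if some endpoint is not disabled. */
    static boolean shouldPredict(Connection c) {
        return c.source != ScheduledState.DISABLED
            || c.destination != ScheduledState.DISABLED;
    }

    public static void main(String[] args) {
        Connection idle = new Connection(ScheduledState.DISABLED, ScheduledState.DISABLED);
        Connection live = new Connection(ScheduledState.RUNNING, ScheduledState.STOPPED);
        System.out.println(shouldPredict(idle)); // false
        System.out.println(shouldPredict(live)); // true
    }
}
{code}
With a guard like this early in the analytics path, the ~2500 disabled processors in this flow
would no longer contribute per-connection prediction work to every root-level status request.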



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
