nifi-issues mailing list archives

From "ASF subversion and git services (Jira)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-6510) Predictive Analytics for NiFi Metrics
Date Mon, 09 Sep 2019 15:38:13 GMT

    [ https://issues.apache.org/jira/browse/NIFI-6510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16925801#comment-16925801 ]

ASF subversion and git services commented on NIFI-6510:
-------------------------------------------------------

Commit 8a8b9c1d086ac41b10647f09bdd7d8174a921de5 in nifi's branch refs/heads/master from Andy I. Christianson
[ https://gitbox.apache.org/repos/asf?p=nifi.git;h=8a8b9c1 ]

NIFI-6510 - Analytics framework (#3681)

* NIFI-6510 Implement initial analytic engine

* NIFI-6510 Implemented basic linear regression model for queue counts

* NIFI-6510 Initial analytics REST endpoint and supporting objects

* NIFI-6510 Connect the dots for StatusAnalytics -> API

* NIFI-6510 Added poc engine with prediction model caching

(cherry picked from commit e013b91)

DFA-9 - updated logging and corrected logic for checking if not in backpressure

(cherry picked from commit a1f8e70)

* NIFI-6510 Updated objects and interfaces to reflect 4 prediction metrics

(cherry picked from commit 050e0fc)

(cherry picked from commit 9fd365f)

* NIFI-6510 adjustments for interface updates, added call to StandardEventAccess, updated
interface to use connection id

(cherry picked from commit 14854ff)

DFA-9 - reduced snapshot interval to 1 minute

(cherry picked from commit 36abb0a)

* NIFI-6510 Split StatusAnalytics interface into Engine and per-Connection versions

* NIFI-6510 Remove redundant connection prediction interfaces as we can just use ConnectionStatusAnalytics
directly

* NIFI-6510 Revert "DFA-9 Remove redundant connection prediction interfaces as we can just
use ConnectionStatusAnalytics directly"

This reverts commit 5b9fead1471059098c0e98343fb337070f1c75c1.

* NIFI-6510 Added prediction fields for use by UI, still need to be populated

* NIFI-6510 Analytics Framework Introduction (#10)

* DFA-9 - Initial refactor for Status Analytics - created additional interfaces for models,
refactored callers to use StatusAnalytics objects with connection context. Implemented SimpleRegression
model.

DFA-9 - added logging

* DFA-9 - relocated query window to CSA from model, adding the prediction percentages and
time interval

* DFA-9 - checkstyle fixes

* NIFI-6510 Add prediction percent values and predicted interval seconds

(cherry picked from commit e60015d)

* NIFI-6510 Changes to inject flowManager instead of flow controller, also changes to properly
reflect when predictions can be made vs not.

(cherry picked from commit 6fae058)

* NIFI-6510 Added tests for engine

(cherry picked from commit 6d7a13b)

* NIFI-6150 Added tests for connection status analytics class, corrected variable names

(cherry picked from commit 58c7c81)

* NIFI-6150 Make checkstyle happy

(cherry picked from commit b6e35ac)

* NIFI-6150 Fixed NaN check and refactored time prediction. Switched to use non caching engine
for testing

* NIFI-6510 Fixed checkstyle issue in TestConnectionStatusAnalytics

* NIFI-6510 Adjusted interval and incorporated R-squared check

Updates to support multiple variables for features, clearing cached regression model based
on r-squared values

Added ordinary least squares model, which truly uses multivariable regression. Refactor of
interfaces to include more general interface for variate models (that include scoring support).

Ratcheck fixes

Added test for SimpleRegression. Minor fix for OLS model

fixed test errors

fixed checkstyle errors

(cherry picked from commit fab411b)

* NIFI-6510 Added property to nifi.properties - Prediction Interval for connection status
analytics (#11)

* NIFI-6566 - Refactor to decouple model instance from status analytics object. Also allow
configurable model from nifi.properties

NIFI-6566 - changes to allow scoring configurations for model in nifi.properties

NIFI-6566 - added default implementation value to NiFiProperties

NIFI-6566 - correction to default variable name in NiFiProperties, removed unnecessary init
method from ConnectionStatusAnalytics

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3663

* NIFI-6585 - Refactored tests to use mocked models and extract functions.  Added check in
ConnectionStatusAnalytics to confirm expected model by type

* NIFI-6586 - documentation and comments

This closes NIFI-6586

Signed-off-by: Andrew I. Christianson <andy@andyic.org>

* NIFI-6568 - Surface time-to-back-pressure and initial predictions in the UI
* Add multi-line tooltips with detail for connection queue back pressure graphics.
* Add estimated time to back pressure to connections summary table.
* Add back pressure prediction ticks.
* add moment.js to format predicted time to back pressure
* tweak summary table headings to match data displayed. re-order connection summary columns

* NIFI-6568 - Properly sort the min estimated time to back pressure in the connection summary
table. Also added a js doc comment.

* NIFI-6510 - add an enable/disable property for analytics

* NIFI-6510 - documentation updates for enable/disable property

* NIFI-6510 - UI: handle the scenario where backpressure predictions are disabled (#3685)

* NIFI-6510 - admin guide updates to further describe model functionality

* NIFI-6510 - code quality fixes (if statement and constructor)

* NIFI-6510 - log warnings when properties could not be retrieved. fixed incorrect property
retrieval for score threshold

* NIFI-6510 Extract out predictions into their own DTO

* NIFI-6510 Optimize imports

* NIFI-6510 Fix formatting

* NIFI-6510 Optimize imports

* NIFI-6510 Optimize imports

* NIFI-6510 - Notice updates for Commons math and Caffeine

* NIFI-6510 - UI updates to account for minor API changes for back pressure predictions (#3697)

* NIFI-6510 - Fix issue displaying estimated time to back pressure in connection summary table
when only one of the predictions is known.

Signed-off-by: Matthew Burgess <mattyb149@apache.org>

This closes #3705

* NIFI-6510 Rip out useless members

* NIFI-6510 - dto updates to check for -1 value

* NIFI-6510 - checkstyle fix

* NIFI-6510 - rolled back last change and applied minNonNegative method

* NIFI-6510 Rip out useless members
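
The log above mentions a linear regression model over queue counts, an R-squared check, a time-to-back-pressure estimate, a -1 "unknown" value, and a minNonNegative method. As a rough illustration of how those pieces could fit together, here is a minimal sketch in Python. It is not NiFi's code: the commit appears to build on Apache Commons Math, while this version is hand-rolled, and every name in it (fit_line, millis_until_threshold, min_non_negative, the 0.9 score threshold) is an assumption made for the example.

```python
from typing import Sequence, Tuple


def fit_line(times: Sequence[float], counts: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of counts = intercept + slope * time.

    Returns (slope, intercept, r_squared).
    """
    n = len(times)
    if n < 2 or n != len(counts):
        raise ValueError("need at least two (time, count) observations")
    mean_t = sum(times) / n
    mean_c = sum(counts) / n
    s_tt = sum((t - mean_t) ** 2 for t in times)
    s_tc = sum((t - mean_t) * (c - mean_c) for t, c in zip(times, counts))
    s_cc = sum((c - mean_c) ** 2 for c in counts)
    if s_tt == 0:
        raise ValueError("all observations share the same timestamp")
    slope = s_tc / s_tt
    intercept = mean_c - slope * mean_t
    # A perfectly flat series has no variance to explain; treat the fit as exact.
    r_squared = 1.0 if s_cc == 0 else (s_tc * s_tc) / (s_tt * s_cc)
    return slope, intercept, r_squared


def millis_until_threshold(
    times_ms: Sequence[float],
    counts: Sequence[float],
    threshold: float,
    now_ms: float,
    min_r_squared: float = 0.9,  # hypothetical score threshold
) -> int:
    """Estimated milliseconds until the fitted line reaches threshold, or -1 if unknown."""
    slope, intercept, r_squared = fit_line(times_ms, counts)
    if r_squared < min_r_squared or slope <= 0:
        return -1  # fit is too poor to trust, or the queue is not growing
    crossing_ms = (threshold - intercept) / slope
    return max(0, round(crossing_ms - now_ms))


def min_non_negative(a: int, b: int) -> int:
    """Smaller of two predictions, ignoring negative (unknown) values; -1 if both are unknown."""
    known = [v for v in (a, b) if v >= 0]
    return min(known) if known else -1
```

For example, with one-minute snapshots at 0, 60000, 120000 and 180000 ms showing 100, 200, 300 and 400 queued objects, millis_until_threshold(..., threshold=1000, now_ms=180000) gives 360000, i.e. about six minutes until the threshold is reached. min_non_negative(-1, 5000) gives 5000, which matches the case in the log where only one of the two predictions (count or bytes) is known.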


> Predictive Analytics for NiFi Metrics
> -------------------------------------
>
>                 Key: NIFI-6510
>                 URL: https://issues.apache.org/jira/browse/NIFI-6510
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Andrew Christianson
>            Assignee: Yolanda M. Davis
>            Priority: Major
>             Fix For: 1.10.0
>
>          Time Spent: 6h 10m
>  Remaining Estimate: 0h
>
> From Yolanda's email to the list:
>  
> {noformat}
> Currently NiFi has lots of metrics available for areas including jvm and flow component
> usage (via component status) as well as provenance data which NiFi makes available either
> through the UI or reporting tasks (for consumption by other systems). Past discussions in
> the community cite users shipping this data to applications such as Prometheus, ELK stacks,
> or Ambari metrics for further analysis in order to capture/review performance issues, detect
> anomalies, and send alerts or notifications. These systems are efficient in capturing and
> helping to analyze these metrics however it requires customization work and knowledge of NiFi
> operations to provide meaningful analytics within a flow context.
> In speaking with Matt Burgess and Andy Christianson on this topic we feel that there
> is an opportunity to introduce an analytics framework that could provide users reasonable
> predictions on key performance indicators for flows, such as back pressure and flow rate,
> to help administrators improve operational management of NiFi clusters. This framework could
> offer several key features:
> - Provide a flexible internal analytics engine and model api which supports the addition
>   of or enhancement to onboard models
> - Support integration of remote or cloud based ML models
> - Support both traditional and online (incremental) learning methods
> - Provide support for model caching (perhaps later inclusion into a model repository
>   or registry)
> - UI enhancements to display prediction information either in existing summary data,
>   new data visualizations, or directly within the flow/canvas (where applicable)
> For an initial target we thought that back pressure prediction would be a good starting
> point for this initiative, given that back pressure detection is a key indicator of flow performance
> and many of the metrics currently available would provide enough data points to create a reasonable
> performing model. We have some ideas on how this could be achieved however we wanted to discuss
> this more with the community to get thoughts about tackling this work, especially if there
> are specific use cases or other factors that should be considered.{noformat}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
