nifi-issues mailing list archives

From "Mark Payne (Jira)" <>
Subject [jira] [Commented] (NIFI-6559) FlowFile Repo Journal Recovery Should not Fail if External Overflow Files are Missing
Date Mon, 19 Aug 2019 16:59:00 GMT


Mark Payne commented on NIFI-6559:

I don't think we can change something like this from within NiFi or from a separate utility.
It would generate the same effect.

Can you better explain the end goal here? If I remember correctly, this was tied to a mailing
list thread about OutOfMemoryErrors. Was the desire to delete these files in order to sacrifice
some of the data but not all? Or to avoid these particular updates because they were known
to be particularly memory-intensive updates? Or something entirely different?

I could imagine perhaps having a utility that might purge data from a particular queue, or
perhaps FlowFiles that have attributes that exceed 65 KB or something like that... but we'd
have to be super careful in a situation like that also because we'd have to ensure that we
kept around a Set of all FlowFiles that were removed so that any further updates to those
FlowFiles would not be included, etc.
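
To make the bookkeeping concrete, here is a rough sketch of the filtering such a utility would need. Everything in it is hypothetical: `PurgeSketch`, `Update`, and `purge` are illustrative names and not NiFi classes, and the 65 KB limit is just the figure mentioned above. The point is only that the Set of removed FlowFile ids has to outlive a single update so that later updates to a purged FlowFile are dropped as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class PurgeSketch {
    // Hypothetical stand-in for one repository update; not a NiFi class.
    static final class Update {
        final long flowFileId;
        final int attributeBytes;

        Update(long flowFileId, int attributeBytes) {
            this.flowFileId = flowFileId;
            this.attributeBytes = attributeBytes;
        }
    }

    // Assumed threshold, taken from the "65 KB or something like that" above.
    static final int MAX_ATTRIBUTE_BYTES = 65 * 1024;

    // Returns the updates to keep. removedIds is supplied by the caller so it
    // can be carried across several journals or partitions.
    static List<Update> purge(List<Update> updates, Set<Long> removedIds) {
        List<Update> kept = new ArrayList<>();
        for (Update update : updates) {
            if (removedIds.contains(update.flowFileId)) {
                continue; // further update to a FlowFile that was already purged
            }
            if (update.attributeBytes > MAX_ATTRIBUTE_BYTES) {
                removedIds.add(update.flowFileId); // remember it for later updates
                continue;
            }
            kept.add(update);
        }
        return kept;
    }
}
```

Note that this single pass only drops the oversized update and anything after it. Updates to the same FlowFile that appear earlier are still kept, so a real utility would likely need a first pass to collect the ids before filtering.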

> FlowFile Repo Journal Recovery Should not Fail if External Overflow Files are Missing
> -------------------------------------------------------------------------------------
>                 Key: NIFI-6559
>                 URL:
>             Project: Apache NiFi
>          Issue Type: Improvement
>          Components: Core Framework
>            Reporter: Peter Wicks
>            Assignee: Peter Wicks
>            Priority: Minor
>          Time Spent: 10m
>  Remaining Estimate: 0h
> When NiFi is journaling the FlowFile repository changes to disk it sometimes writes Overflow
> files if it exceeds a certain memory threshold.
> These files are tracked inside of the *.journal files as External File References. If
> one of these external file references is deleted or lost the entire journal fails to recover.
> Instead, I feel this should work more like FlowFiles that lose their queue, or Content
> in the Content Repository that has lost its FlowFile.  Log it, and move on.
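
The "log it, and move on" behavior requested in the description could look roughly like the sketch below. It is not the NiFi recovery code: `OverflowRecoverySketch` and `resolveOverflowFiles` are hypothetical names, and the journal's external file references are assumed to be plain paths. A missing overflow file produces a warning and is skipped instead of aborting the whole recovery.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

public class OverflowRecoverySketch {
    private static final Logger LOG =
            Logger.getLogger(OverflowRecoverySketch.class.getName());

    // Hypothetical helper: given the external file references found in a
    // journal, return only those that still exist on disk.
    static List<Path> resolveOverflowFiles(List<Path> references) {
        List<Path> present = new ArrayList<>();
        for (Path reference : references) {
            if (Files.isRegularFile(reference)) {
                present.add(reference);
            } else {
                // Log it and move on rather than failing the whole journal.
                LOG.warning("Overflow file " + reference
                        + " referenced by the journal is missing; skipping its updates");
            }
        }
        return present;
    }
}
```

As the comment above points out, skipping a file this way silently discards the updates it held, so any later updates to the same FlowFiles would still need to be accounted for.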

