[ https://issues.apache.org/jira/browse/NIFI-527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Mark Payne updated NIFI-527:
----------------------------
Attachment: 0004-NIFI-527-Added-unit-test-to-verify-backpressure.patch
            0003-NIFI-527-Cleaned-up-log-messages.patch
            0002-NIFI-527-More-performance-improvements-including-reu.patch
            0001-NIFI-527-Refactored-the-serialization-format-of-the-.patch
> Persistent Prov Repo should compress write-ahead-log files in chunks
> --------------------------------------------------------------------
>
> Key: NIFI-527
> URL: https://issues.apache.org/jira/browse/NIFI-527
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 0.1.0
>
> Attachments: 0001-NIFI-527-Refactored-the-serialization-format-of-the-.patch,
> 0002-NIFI-527-More-performance-improvements-including-reu.patch,
> 0003-NIFI-527-Cleaned-up-log-messages.patch,
> 0004-NIFI-527-Added-unit-test-to-verify-backpressure.patch
>
>
> Currently, when we roll over a prov log, we compress the entire file as a single GZIP
> stream. As a result, jumping to a particular offset requires opening a GZIPInputStream
> at the start of the file and decompressing all of the preceding data. If we instead
> compress the log in chunks, we can jump directly to the byte offset of a particular
> chunk using FileInputStream.skip and then open a GZIPInputStream from there. This
> full-file decompression is currently by far the biggest bottleneck in the prov repo
> when executing queries.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)