[ https://issues.apache.org/jira/browse/NIFI-527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507116#comment-14507116 ]
ASF subversion and git services commented on NIFI-527:
------------------------------------------------------
Commit a1027aeae51826ca22234e373ecd207c66e72ab7 in incubator-nifi's branch refs/heads/improve-prov-performance
from [~markap14]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-nifi.git;h=a1027ae ]
NIFI-527: Refactored the serialization format of the persistent prov repo to use compression
blocks and index them
> Persistent Prov Repo should compress write-ahead-log files in chunks
> --------------------------------------------------------------------
>
> Key: NIFI-527
> URL: https://issues.apache.org/jira/browse/NIFI-527
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Reporter: Mark Payne
> Assignee: Mark Payne
> Fix For: 0.1.0
>
> Attachments: 0001-NIFI-527-Refactored-the-serialization-format-of-the-.patch,
> 0002-NIFI-527-More-performance-improvements-including-reu.patch, 0003-NIFI-527-Cleaned-up-log-messages.patch,
> 0004-NIFI-527-Added-unit-test-to-verify-backpressure.patch
>
>
> Currently, when we roll over a prov log, we compress the entire file as a single GZIP stream.
> This means that when we want to jump to a particular offset, we have to open a GZIPInputStream
> and read through all of the data that precedes that offset. If we instead compress the logs in
> chunks, we can jump directly to a particular chunk using FileInputStream.skip and then open a
> GZIPInputStream from there. Reading through the whole compressed file is currently by far the
> biggest bottleneck in the prov repo when doing queries.
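
For illustration only, here is a minimal Java sketch of the chunked approach described above. It is not the code from the attached patches: the class name ChunkedGzipSketch, the methods writeChunks and readChunk, and the choice to keep the chunk offsets in an in-memory list are all assumptions made for this example. Each chunk is written as its own GZIP member and its starting byte offset is recorded, so a reader can skip straight to that offset and decompress from there.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical sketch; names and layout are assumptions, not NiFi's actual format.
public class ChunkedGzipSketch {

    /**
     * Compresses each chunk as an independent GZIP member, appends it to the file,
     * and returns the byte offset in the file at which each compressed chunk starts.
     */
    static List<Long> writeChunks(File file, List<byte[]> chunks) throws IOException {
        List<Long> offsets = new ArrayList<>();
        long offset = 0L;
        try (FileOutputStream fos = new FileOutputStream(file)) {
            for (byte[] chunk : chunks) {
                ByteArrayOutputStream compressed = new ByteArrayOutputStream();
                try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
                    gzip.write(chunk);
                }
                offsets.add(offset);          // index entry: where this chunk begins
                fos.write(compressed.toByteArray());
                offset += compressed.size();
            }
        }
        return offsets;
    }

    /**
     * Skips to the given compressed offset and decompresses exactly
     * uncompressedLength bytes, without touching any earlier chunk.
     */
    static byte[] readChunk(File file, long offset, int uncompressedLength) throws IOException {
        try (FileInputStream fis = new FileInputStream(file)) {
            long remaining = offset;
            while (remaining > 0) {
                // skip() may skip fewer bytes than requested, so loop until done
                long skipped = fis.skip(remaining);
                if (skipped <= 0) {
                    throw new EOFException("Could not skip to offset " + offset + " in " + file);
                }
                remaining -= skipped;
            }
            byte[] data = new byte[uncompressedLength];
            try (DataInputStream in = new DataInputStream(new GZIPInputStream(fis))) {
                in.readFully(data);
            }
            return data;
        }
    }
}
```

In this sketch the caller passes the uncompressed length because GZIPInputStream keeps reading into the next GZIP member when one ends; a real format would more likely store record boundaries or block lengths alongside the offsets. Calling readChunk(file, offsets.get(2), length) reads only the third chunk, whereas a file compressed as one stream would require decompressing everything before it.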
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)