nifi-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NIFI-2681) Avoid caching Provenance Index Searchers
Date Tue, 06 Sep 2016 15:45:20 GMT

    [ https://issues.apache.org/jira/browse/NIFI-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15467730#comment-15467730 ]

ASF GitHub Bot commented on NIFI-2681:
--------------------------------------

Github user bbende commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/958#discussion_r77661605
  
    --- Diff: nifi-nar-bundles/nifi-provenance-repository-bundle/nifi-persistent-provenance-repository/src/main/java/org/apache/nifi/provenance/lucene/CachingIndexManager.java ---
    @@ -0,0 +1,535 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.nifi.provenance.lucene;
    +
    +import java.io.Closeable;
    +import java.io.File;
    +import java.io.IOException;
    +import java.util.ArrayList;
    +import java.util.HashMap;
    +import java.util.Iterator;
    +import java.util.List;
    +import java.util.Map;
    +import java.util.concurrent.atomic.AtomicInteger;
    +import java.util.concurrent.locks.Lock;
    +import java.util.concurrent.locks.ReentrantLock;
    +
    +import org.apache.lucene.analysis.Analyzer;
    +import org.apache.lucene.analysis.standard.StandardAnalyzer;
    +import org.apache.lucene.index.DirectoryReader;
    +import org.apache.lucene.index.IndexWriter;
    +import org.apache.lucene.index.IndexWriterConfig;
    +import org.apache.lucene.search.IndexSearcher;
    +import org.apache.lucene.store.Directory;
    +import org.apache.lucene.store.FSDirectory;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +public class CachingIndexManager implements Closeable, IndexManager {
    +    private static final Logger logger = LoggerFactory.getLogger(CachingIndexManager.class);
    +
    +    private final Lock lock = new ReentrantLock();
    +    private final Map<File, IndexWriterCount> writerCounts = new HashMap<>();
    +    private final Map<File, List<ActiveIndexSearcher>> activeSearchers = new HashMap<>();
    +
    +
    +    public void removeIndex(final File indexDirectory) {
    +        final File absoluteFile = indexDirectory.getAbsoluteFile();
    +        logger.info("Removing index {}", indexDirectory);
    +
    +        lock.lock();
    +        try {
    +            final IndexWriterCount count = writerCounts.remove(absoluteFile);
    +            if ( count != null ) {
    +                try {
    +                    count.close();
    +                } catch (final IOException ioe) {
    +                    logger.warn("Failed to close Index Writer {} for {}", count.getWriter(), absoluteFile);
    +                    if ( logger.isDebugEnabled() ) {
    +                        logger.warn("", ioe);
    +                    }
    +                }
    +            }
    +
    +            for ( final List<ActiveIndexSearcher> searcherList : activeSearchers.values() ) {
    --- End diff --
    
    Wouldn't we want to get the List<ActiveIndexSearcher> for the absoluteFile, rather than every active searcher?
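
    To make the question concrete, here is a minimal, standalone sketch of the per-directory lookup, not the actual PR code. FakeSearcher and SearcherRegistrySketch are hypothetical stand-ins for ActiveIndexSearcher and the index manager, and the locking used in the real class is left out for brevity.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: closes only the searchers that belong to the removed index.
public class SearcherRegistrySketch {

    // Stand-in for ActiveIndexSearcher (assumption); it only records whether close() was called.
    public static final class FakeSearcher implements Closeable {
        private boolean closed;

        @Override
        public void close() throws IOException {
            closed = true;
        }

        public boolean isClosed() {
            return closed;
        }
    }

    // Keyed by absolute index directory, mirroring the activeSearchers map in the diff.
    private final Map<File, List<FakeSearcher>> activeSearchers = new HashMap<>();

    // Registers a new searcher under the absolute form of the given directory.
    public FakeSearcher register(final File indexDirectory) {
        final FakeSearcher searcher = new FakeSearcher();
        activeSearchers
            .computeIfAbsent(indexDirectory.getAbsoluteFile(), key -> new ArrayList<>())
            .add(searcher);
        return searcher;
    }

    // Removes the entry for this directory only and closes its searchers.
    // Returns the number of searchers that were closed successfully.
    public int removeIndex(final File indexDirectory) {
        final List<FakeSearcher> searchers = activeSearchers.remove(indexDirectory.getAbsoluteFile());
        if (searchers == null) {
            return 0; // nothing was registered for this directory
        }

        int closedCount = 0;
        for (final FakeSearcher searcher : searchers) {
            try {
                searcher.close();
                closedCount++;
            } catch (final IOException ioe) {
                // A real implementation would log the failure here and keep going.
            }
        }
        return closedCount;
    }

    public static void main(final String[] args) {
        final SearcherRegistrySketch registry = new SearcherRegistrySketch();
        final FakeSearcher removed = registry.register(new File("index-a"));
        final FakeSearcher untouched = registry.register(new File("index-b"));

        registry.removeIndex(new File("index-a"));
        System.out.println("index-a searcher closed: " + removed.isClosed());
        System.out.println("index-b searcher closed: " + untouched.isClosed());
    }
}
```

    Using Map.remove on the absolute file touches only that directory's list, so searchers for other indexes stay open.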


> Avoid caching Provenance Index Searchers
> ----------------------------------------
>
>                 Key: NIFI-2681
>                 URL: https://issues.apache.org/jira/browse/NIFI-2681
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>            Reporter: Mark Payne
>            Assignee: Mark Payne
>            Priority: Critical
>             Fix For: 1.1.0
>
>
> In NIFI-2600 and NIFI-2452, we addressed two bugs where the Provenance Repository closes
> a cached IndexSearcher too soon. The IndexManager keeps the searchers cached in an effort
> to offer better performance when performing a Provenance Query. This was done because it was
> recommended in the Lucene documentation. However, we occasionally still see nodes crashing
> with segfaults due to the Lucene searching. We should update the Persistent Provenance Repository
> to stop caching Index Searchers, giving up a slight performance improvement in exchange for
> significantly better reliability.
> Playing around with the idea in order to test it out shows very favorable results. On
> a system where I could cause a segfault almost every time that I ran a large provenance query,
> I updated the code to no longer cache the readers and saw perfect stability with no noticeable
> performance degradation.
> I will clean up the code and submit a PR for these changes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
