incubator-blur-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: IndexReaderClosedException...
Date Wed, 01 Jun 2016 11:34:47 GMT
Just one more observation here...

Even if readerPooling is set to true, Lucene still has 2 readers (one for
search & one for updates/deletes).

But with pooling enabled, the reader for updates/deletes is no longer
opened and closed on every commit call (which is the default behavior as
of today). It is opened only once, during the first update/delete call.
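
To make the difference concrete, here is a small toy model in plain Java.
It is only a sketch of the behavior described above, not Lucene code:
ToyWriter, commit() and opensAfter() are hypothetical names made up for
illustration. In the Lucene 4.x line the real switch should be
IndexWriterConfig.setReaderPooling(true), but please verify that against
the Lucene version your Blur build uses.

```java
public class ReaderPoolingSketch {

    /** Hypothetical writer that counts how often it opens its internal reader. */
    static final class ToyWriter {
        private final boolean readerPooling;
        private boolean pooledReaderOpen = false;
        private int readerOpens = 0;

        ToyWriter(boolean readerPooling) {
            this.readerPooling = readerPooling;
        }

        /** Models a commit that has updates/deletes to apply. */
        void commit() {
            if (readerPooling) {
                // Pooled: open on the first call only, then keep the reader around.
                if (!pooledReaderOpen) {
                    readerOpens++;
                    pooledReaderOpen = true;
                }
            } else {
                // Not pooled: every commit opens a fresh reader and closes it again.
                readerOpens++;
            }
        }

        int readerOpens() {
            return readerOpens;
        }
    }

    /** Number of internal reader opens after the given number of commits. */
    static int opensAfter(boolean readerPooling, int commits) {
        ToyWriter writer = new ToyWriter(readerPooling);
        for (int i = 0; i < commits; i++) {
            writer.commit();
        }
        return writer.readerOpens();
    }

    public static void main(String[] args) {
        System.out.println("pooling off: " + opensAfter(false, 100)); // 100 opens
        System.out.println("pooling on:  " + opensAfter(true, 100));  // 1 open
    }
}
```

The model ignores merges and segment-level detail; it only shows why the
per-commit cost goes away once the reader is kept open.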

On Wed, Jun 1, 2016 at 3:10 PM, Ravikumar Govindarajan <
ravikumar.govindarajan@gmail.com> wrote:

>> In newer versions of the code there are multiple streams involved.  One
>> for each open file handle plus if a sequential read is detected a new
>> stream is created for the instance for better performance
>
>
> Great. We just patched up our Blur version with this code.
>
> While digging into the reader-closed issue, I was quite surprised to
> observe the following behavior:
>
>    - Issue a commit
>    - Lucene opens a new reader via IndexWriter. (Doesn't re-use our
>    already opened DirectoryReader)
>    - Processes all updates/deletes/merges
>    - Closes the new reader
>    - Complete commit
>
> For a big index & lots of commits, opening a new reader for every commit
> is prohibitively expensive.
>
>
> Here is the JIRA for it...
> https://issues.apache.org/jira/browse/LUCENE-2297
>
> All we need to do is set "readerPooling=true" in the IndexWriterConfig
> class.
>
> Please do explore this option when you find time.
>
> --
> Ravi
>
>
>
> On Tue, May 24, 2016 at 7:48 PM, Aaron McCurry <amccurry@gmail.com> wrote:
>
>> On Tue, May 24, 2016 at 6:06 AM, Ravikumar Govindarajan <
>> ravikumar.govindarajan@gmail.com> wrote:
>>
>> > We have solved it temporarily by using a KeepLastTwoCommits del policy.
>> > We don't get these exceptions now!!!
>> >
>>
>> Great!
>>
>>
>> >
>> > Btw, I see that pread calls in FSDataInputStream.java are synchronized.
>> > Is it possible that merge DFS read calls could potentially block search
>> > DFS read calls?
>> >
>>
>> Yes.
>>
>>
>> >
>> > Would it be a good idea to have 2 DFSInputStreams for every file, one
>> > for merge & another for search?
>> >
>>
>> In newer versions of the code there are multiple streams involved.  One
>> for each open file handle, plus, if a sequential read is detected, a new
>> stream is created for the instance for better performance.  Check out the
>> HdfsDirectory class.
>>
>> Aaron
>>
>>
>> >
>> > On Tue, May 10, 2016 at 7:43 PM, Ravikumar Govindarajan <
>> > ravikumar.govindarajan@gmail.com> wrote:
>> >
>> > > Sorry, I misunderstood the code.
>> > > I see that it has 2 locks, IndexRefreshWriteLock &
>> > > IndexRefreshReadLock. They look to be separate.
>> > >
>> > > On Tue, May 10, 2016 at 7:16 PM, Ravikumar Govindarajan <
>> > > ravikumar.govindarajan@gmail.com> wrote:
>> > >
>> > >> Thanks a lot Aaron.
>> > >>
>> > >> I guess we took a commit of 0.2.2 that doesn't have the
>> > >> IndexRefreshWriteLock (IRWL). It looks like it co-ordinates between
>> > >> searches & incoming mutation commits. If so, then it will likely
>> > >> solve the first issue for us (AlreadyClosedException)
>> > >>
>> > >>
>> > >> Can you recollect if that was the reason IRWL was introduced?
>> > >>
>> > >> On Tue, May 10, 2016 at 6:40 PM, Aaron McCurry <amccurry@gmail.com>
>> > >> wrote:
>> > >>
>> > >>> On Tue, May 10, 2016 at 2:30 AM, Ravikumar Govindarajan <
>> > >>> ravikumar.govindarajan@gmail.com> wrote:
>> > >>>
>> > >>> > Actually there are 2 issues...
>> > >>> >
>> > >>> > 1. IndexReaderClosedException
>> > >>> > 2. HDFS Stream Closed
>> > >>> >
>> > >>>
>> > >>> Likely when the index is closed it closes the underlying
>> > >>> IndexInputs as well, causing the HDFS Stream Closed exception.
>> > >>>
>> > >>>
>> > >>> >
>> > >>> > Merge completion results in File Deletion & ultimately HDFS Stream
>> > >>> > Closed during Search....
>> > >>> >
>> > >>> > I use IndexFileDeleter with KeepOnlyLastCommitDeletionPolicy. This
>> > >>> > blindly deletes the file, without bothering to cross-check
>> > >>> > IndexReader.RefCount > 0.
>> > >>> >
>> > >>>
>> > >>> Hmm.  You can see here:
>> > >>>
>> > >>>
>> > >>>
>> >
>> https://github.com/apache/incubator-blur/blob/release-0.2.2-incubating/blur-core/src/main/java/org/apache/blur/manager/writer/BlurIndexSimpleWriter.java#L303
>> > >>>
>> > >>> That once the new index is available it is swapped into the index
>> > >>> ref object and the old one is sent to the index closer.  Once the
>> > >>> refs to the index are low enough it closes the index.  Or at least
>> > >>> it should.
>> > >>>
>> > >>> I will continue looking into the problem but I don't have a solution
>> > >>> for you yet.
>> > >>>
>> > >>> Aaron
>> > >>>
>> > >>>
>> > >>>
>> > >>> >
>> > >>> >
>> > >>> > *Exception(message:Unknown error during rewrite, stackTraceStr:java.io.IOException: Stream closed*
>> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1385)
>> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1374)
>> > >>> > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
>> > >>> > at org.apache.blur.store.hdfs.HdfsIndexInput.readInternal(HdfsIndexInput.java:62)
>> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:167)
>> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:122)
>> > >>> > at org.apache.blur.store.hdfs.MmapCacheIndexInput.readAndcache(MmapCacheIndexInput.java:24)
>> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fillNormally(CacheIndexInput.java:354)
>> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fill(CacheIndexInput.java:379)
>> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.tryToFill(CacheIndexInput.java:297)
>> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.readByte(CacheIndexInput.java:151)
>> > >>> > at org.apache.blur.lucene.warmup.TraceableIndexInput.readByte(TraceableIndexInput.java:62)
>> > >>> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
>> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2366)
>> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1949)
>> > >>> > at org.apache.blur.index.ExitableReader$ExitableTermsEnum.seekCeil(ExitableReader.java:250)
>> > >>> > at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
>> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
>> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>> > >>> > at
>> > >>> >
>> > >>> > On Mon, May 9, 2016 at 4:42 PM, Ravikumar Govindarajan <
>> > >>> > ravikumar.govindarajan@gmail.com> wrote:
>> > >>> >
>> > >>> > > One extra info we gleaned from the logs...
>> > >>> > >
>> > >>> > > 1. Merge Starts & is about to complete
>> > >>> > > 2. Searcher is opened
>> > >>> > > 3. Merge Completes
>> > >>> > > 4. Ref-count drops to 0 in IndexReader
>> > >>> > > 5. IndexReader closed while Searcher is still open
>> > >>> > >
>> > >>> > > This seems to be the main pattern for causing the Exception
>> > >>> > >
>> > >>> > > --
>> > >>> > > Ravi
>> > >>> > >
>> > >>> > > On Mon, May 9, 2016 at 3:08 PM, Ravikumar Govindarajan <
>> > >>> > > ravikumar.govindarajan@gmail.com> wrote:
>> > >>> > >
>> > >>> > >> Thanks Aaron...
>> > >>> > >>
>> > >>> > >> Just a quick question. Lucene itself has ref-counting to close
>> > >>> > >> its readers, no? Or does Blur have its own logic to handle it?
>> > >>> > >>
>> > >>> > >> --
>> > >>> > >> Ravi
>> > >>> > >>
>> > >>> > >> On Fri, May 6, 2016 at 7:56 PM, Aaron McCurry <amccurry@gmail.com>
>> > >>> > >> wrote:
>> > >>> > >>
>> > >>> > >>> Likely yes.  If I have a few minutes this weekend I can look
>> > >>> > >>> through that version and see if I can point you in the right
>> > >>> > >>> direction.
>> > >>> > >>>
>> > >>> > >>> On Fri, May 6, 2016 at 8:46 AM, Ravikumar Govindarajan <
>> > >>> > >>> ravikumar.govindarajan@gmail.com> wrote:
>> > >>> > >>>
>> > >>> > >>> > Sometimes during an ongoing search we receive an
>> > >>> > >>> > IndexReaderClosedException...
>> > >>> > >>> >
>> > >>> > >>> > We are on an older version of Blur (0.2.2). Has this been
>> > >>> > >>> > fixed in newer versions, or have we been using it wrongly?
>> > >>> > >>> >
>> > >>> > >>> >
>> > >>> > >>> > *stackTraceStr:org.apache.lucene.store.AlreadyClosedException: this IndexReader cannot be used anymore as one of its child readers was closed*
>> > >>> > >>> > at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:257)
>> > >>> > >>> > at org.apache.lucene.index.FilterAtomicReader.fields(FilterAtomicReader.java:380)
>> > >>> > >>> > at org.apache.blur.index.ExitableReader$ExitableFilterAtomicReader.fields(ExitableReader.java:81)
>> > >>> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:52)
>> > >>> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
>> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
>> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
>> > >>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
>> > >>> > >>> >
>> > >>> > >>>
>> > >>> > >>
>> > >>> > >>
>> > >>> > >
>> > >>> >
>> > >>>
>> > >>
>> > >>
>> > >
>> >
>>
>
>
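
The failure pattern reported earlier in this thread (searcher opened,
merge completes, ref-count drops to 0, reader closed while the searcher
is still open) can be replayed with a small stand-alone sketch. This is
not Blur or Lucene code: ToyReader and closedDuringSearch() are
hypothetical names, and the sketch assumes the searcher is expected to
take its own reference before using the reader. Lucene's IndexReader
offers incRef/tryIncRef/decRef for the same purpose.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountSketch {

    /** Hypothetical reader that closes itself when its ref-count reaches 0. */
    static final class ToyReader {
        // Starts at 1: the reference held by whoever opened the reader.
        private final AtomicInteger refCount = new AtomicInteger(1);
        private volatile boolean closed = false;

        /** Takes a reference unless the reader has already been closed. */
        boolean tryIncRef() {
            int count;
            while ((count = refCount.get()) > 0) {
                if (refCount.compareAndSet(count, count + 1)) {
                    return true;
                }
            }
            return false;
        }

        /** Drops a reference; whoever drops the last one closes the reader. */
        void decRef() {
            int count = refCount.decrementAndGet();
            if (count == 0) {
                closed = true;
            } else if (count < 0) {
                throw new IllegalStateException("decRef called too many times");
            }
        }

        boolean isClosed() {
            return closed;
        }
    }

    /**
     * Replays the logged sequence and reports whether the reader was
     * closed while the searcher was still using it.
     */
    static boolean closedDuringSearch(boolean searcherTakesRef) {
        ToyReader reader = new ToyReader();
        // Searcher is opened, optionally pinning the reader.
        boolean pinned = searcherTakesRef && reader.tryIncRef();
        // Merge completes; the owner swaps in a new reader and drops its ref.
        reader.decRef();
        // Is the old reader already closed while the search is still running?
        boolean closedWhileSearching = reader.isClosed();
        // Search finishes and releases its reference, if it took one.
        if (pinned) {
            reader.decRef();
        }
        return closedWhileSearching;
    }

    public static void main(String[] args) {
        System.out.println("no ref taken:  " + closedDuringSearch(false)); // true
        System.out.println("ref was taken: " + closedDuringSearch(true));  // false
    }
}
```

If the searcher does not pin the reader, the owner's decRef closes it
mid-search, which matches the logged pattern. With the extra reference
the close is deferred until the search releases it.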
