incubator-blur-user mailing list archives

From Ravikumar Govindarajan <ravikumar.govindara...@gmail.com>
Subject Re: IndexReaderClosedException...
Date Thu, 16 Jun 2016 11:03:14 GMT
>
>  I didn't fully understand the underlying Lucene reader, writer,
> open, close semantics


I don't know the correct behavior either. The Lucene code is incredibly hairy to
follow... :)

I have pinged the Lucene mailing list. Hope someone replies...

On Tue, Jun 7, 2016 at 4:46 PM, Aaron McCurry <amccurry@gmail.com> wrote:

> On Wed, Jun 1, 2016 at 7:34 AM, Ravikumar Govindarajan <
> ravikumar.govindarajan@gmail.com> wrote:
>
> > Just one more observation here...
> >
> > Even if readerPooling is set to true, Lucene has 2 readers (one for search
> > & one for updates/deletes)
> >
> > But with pooling, the reader for updates/deletes is not opened/closed on
> > every commit call, as it is by default today. It is opened only once
> > (during the first update/delete call)
> >
>
> I will take a closer look at the code for this one.  Likely when I wrote
> this code I didn't fully understand the underlying Lucene reader, writer,
> open, close semantics.  Thank you for pointing this out!
>
> Aaron
>
>
> >
> > On Wed, Jun 1, 2016 at 3:10 PM, Ravikumar Govindarajan <
> > ravikumar.govindarajan@gmail.com> wrote:
> >
> > >> In newer versions of the code there are multiple streams involved.  One for
> > >> each open file handle plus if a sequential read is detected a new stream is
> > >> created for the instance for better performance
> > >
> > >
> > > Great. We just patched up our Blur version with this code.
> > >
> > > > While I was digging into the reader-closed issue, I was quite surprised to
> > > > observe the following behavior
> > >
> > >    - Issue a commit
> > >    - Lucene opens a new reader via IndexWriter. (Doesn't re-use our
> > >    already opened DirectoryReader)
> > >    - Processes all updates/deletes/merges
> > >    - Closes the new reader
> > >    - Complete commit
> > >
> > > > For a big index & lots of commits, opening a new reader for every commit
> > > > is prohibitively expensive.
> > >
> > >
> > > Here is the JIRA for it...
> > > https://issues.apache.org/jira/browse/LUCENE-2297
> > >
> > > All we need to do is just set "readerPooling=true" in IndexWriterConfig
> > > class
> > >
> > > Please do explore this option when you find time.
> > >
> > > --
> > > Ravi
> > >
> > >
> > >
> > > > On Tue, May 24, 2016 at 7:48 PM, Aaron McCurry <amccurry@gmail.com> wrote:
> > >
> > >> On Tue, May 24, 2016 at 6:06 AM, Ravikumar Govindarajan <
> > >> ravikumar.govindarajan@gmail.com> wrote:
> > >>
> > >> > We have solved it temporarily by using a KeepLastTwoCommits deletion policy.
> > >> > We don't get these exceptions now!!!
> > >> >
> > >>
> > >> Great!
> > >>
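The thread does not show the KeepLastTwoCommits policy itself, so the following is only a sketch of the selection rule such a policy would presumably apply. `KeepLastCommits` and `toDelete` are hypothetical names, commits are modeled as plain strings, and the list is assumed to be ordered oldest first (the order in which Lucene hands commits to an `IndexDeletionPolicy`). A real policy would implement `IndexDeletionPolicy` and call `delete()` on each selected `IndexCommit`.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Lucene code) of a "keep the newest N commits" rule. */
final class KeepLastCommits {
    private KeepLastCommits() {
    }

    /**
     * Returns the commits that would be deleted when only the newest
     * {@code keep} commits are retained. {@code commits} must be ordered
     * oldest first.
     */
    static List<String> toDelete(List<String> commits, int keep) {
        if (keep < 1) {
            throw new IllegalArgumentException("keep must be at least 1");
        }
        // Everything before this index is older than the newest `keep` commits.
        int cut = Math.max(0, commits.size() - keep);
        return new ArrayList<>(commits.subList(0, cut));
    }
}
```

With `keep = 2`, the files of the previous commit survive one extra commit cycle, which presumably gives searchers still reading them time to finish. That is consistent with the exceptions disappearing, but it narrows the window rather than closing it.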
> > >>
> > >> >
> > >> > Btw, I see that pread calls in FSDataInputStream.java are synchronized. Is
> > >> > it possible that merge DFS read calls could potentially block search DFS
> > >> > read calls?
> > >> >
> > >>
> > >> Yes.
> > >>
> > >>
> > >> >
> > >> > Would it be a good idea to have 2 DFSInputStreams for every file, one for
> > >> > merge & another for search?
> > >> >
> > >>
> > >> In newer versions of the code there are multiple streams involved.  One for
> > >> each open file handle, plus if a sequential read is detected a new stream is
> > >> created for the instance for better performance.  Check out the
> > >> HdfsDirectory class.
> > >>
> > >> Aaron
> > >>
> > >>
> > >> >
> > >> > On Tue, May 10, 2016 at 7:43 PM, Ravikumar Govindarajan <
> > >> > ravikumar.govindarajan@gmail.com> wrote:
> > >> >
> > >> > > Sorry, I misunderstood the code.
> > >> > > I see that it has 2 locks, IndexRefreshWriteLock & IndexRefreshReadLock.
> > >> > > They look to be separate
> > >> > >
> > >> > > On Tue, May 10, 2016 at 7:16 PM, Ravikumar Govindarajan <
> > >> > > ravikumar.govindarajan@gmail.com> wrote:
> > >> > >
> > >> > >> Thanks a lot Aaron.
> > >> > >>
> > >> > >> I guess we took a commit of 0.2.2 that doesn't have the
> > >> > >> IndexRefreshWriteLock (IRWL). It looks like it co-ordinates between
> > >> > >> searches & incoming mutation commits. If so, then it will likely solve
> > >> > >> the first issue for us (AlreadyClosedException)
> > >> > >>
> > >> > >>
> > >> > >> Can you recollect if that was the reason IRWL was introduced?
> > >> > >>
> > >> > >> On Tue, May 10, 2016 at 6:40 PM, Aaron McCurry <amccurry@gmail.com>
> > >> > >> wrote:
> > >> > >>
> > >> > >>> On Tue, May 10, 2016 at 2:30 AM, Ravikumar Govindarajan <
> > >> > >>> ravikumar.govindarajan@gmail.com> wrote:
> > >> > >>>
> > >> > >>> > Actually there are 2 issues...
> > >> > >>> >
> > >> > >>> > 1. IndexReaderClosedException
> > >> > >>> > 2. HDFS Stream Closed
> > >> > >>> >
> > >> > >>>
> > >> > >>> Likely when the index is closed it closes the underlying IndexInputs as
> > >> > >>> well, causing the HDFS Stream closed exception.
> > >> > >>>
> > >> > >>>
> > >> > >>> >
> > >> > >>> > Merge completion results in File Deletion & ultimately HDFS Stream Closed
> > >> > >>> > during Search....
> > >> > >>> >
> > >> > >>> > I use IndexFileDeleter with KeepOnlyLastCommitDeletionPolicy. This blindly
> > >> > >>> > deletes the file, without bothering to cross-check IndexReader.RefCount >
> > >> > >>> > 0.
> > >> > >>> >
> > >> > >>>
> > >> > >>> Hmm.  You can see here:
> > >> > >>>
> > >> > >>> https://github.com/apache/incubator-blur/blob/release-0.2.2-incubating/blur-core/src/main/java/org/apache/blur/manager/writer/BlurIndexSimpleWriter.java#L303
> > >> > >>>
> > >> > >>> That once the new index is available it is swapped into the index ref
> > >> > >>> object and the old one is sent to the index closer.  Once the refs to the
> > >> > >>> index are low enough it closes the index.  Or at least it should.
> > >> > >>>
> > >> > >>> I will continue looking into the problem but I don't have a solution for
> > >> > >>> you yet.
> > >> > >>>
> > >> > >>> Aaron
> > >> > >>>
> > >> > >>>
> > >> > >>>
> > >> > >>> >
> > >> > >>> >
> > >> > >>> > *Exception(message:Unknown error during rewrite,
> > >> > >>> > stackTraceStr:java.io.IOException: Stream closed*
> > >> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1385)
> > >> > >>> > at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1374)
> > >> > >>> > at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:89)
> > >> > >>> > at org.apache.blur.store.hdfs.HdfsIndexInput.readInternal(HdfsIndexInput.java:62)
> > >> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:167)
> > >> > >>> > at org.apache.blur.store.buffer.ReusedBufferedIndexInput.readBytes(ReusedBufferedIndexInput.java:122)
> > >> > >>> > at org.apache.blur.store.hdfs.MmapCacheIndexInput.readAndcache(MmapCacheIndexInput.java:24)
> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fillNormally(CacheIndexInput.java:354)
> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.fill(CacheIndexInput.java:379)
> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.tryToFill(CacheIndexInput.java:297)
> > >> > >>> > at org.apache.blur.store.blockcache_v2.CacheIndexInput.readByte(CacheIndexInput.java:151)
> > >> > >>> > at org.apache.blur.lucene.warmup.TraceableIndexInput.readByte(TraceableIndexInput.java:62)
> > >> > >>> > at org.apache.lucene.store.DataInput.readVInt(DataInput.java:108)
> > >> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum$Frame.loadBlock(BlockTreeTermsReader.java:2366)
> > >> > >>> > at org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader$SegmentTermsEnum.seekCeil(BlockTreeTermsReader.java:1949)
> > >> > >>> > at org.apache.blur.index.ExitableReader$ExitableTermsEnum.seekCeil(ExitableReader.java:250)
> > >> > >>> > at org.apache.lucene.index.FilteredTermsEnum.next(FilteredTermsEnum.java:225)
> > >> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:78)
> > >> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
> > >> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
> > >> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > >> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > >> > >>> > at
> > >> > >>> >
> > >> > >>> > On Mon, May 9, 2016 at 4:42 PM, Ravikumar Govindarajan <
> > >> > >>> > ravikumar.govindarajan@gmail.com> wrote:
> > >> > >>> >
> > >> > >>> > > One extra info we gleaned from the logs...
> > >> > >>> > >
> > >> > >>> > > 1. Merge Starts & is about to complete
> > >> > >>> > > 2. Searcher is opened
> > >> > >>> > > 3. Merge Completes
> > >> > >>> > > 4. Ref-count drops to 0 in IndexReader
> > >> > >>> > > 5. IndexReader closed while Searcher is still open
> > >> > >>> > >
> > >> > >>> > > This seems to be the main pattern for causing the Exception
> > >> > >>> > >
> > >> > >>> > > --
> > >> > >>> > > Ravi
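The five-step pattern above is what reference counting is meant to prevent, as long as every searcher takes its own reference before reading and releases it afterwards. Below is a minimal sketch of that discipline. It is not Blur or Lucene code, and `RefCountedReader` is a hypothetical class, although the method names mirror the `tryIncRef()` and `decRef()` methods on Lucene's `IndexReader`.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy reference-counted reader (not Lucene code); closes when the count hits 0. */
final class RefCountedReader {
    // Starts at 1: the reference held by whoever opened the reader.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean closed = false;

    /** Takes a reference unless the count has already dropped to zero. */
    boolean tryIncRef() {
        int count;
        while ((count = refCount.get()) > 0) {
            if (refCount.compareAndSet(count, count + 1)) {
                return true;
            }
        }
        return false;
    }

    /** Releases one reference; the last release closes the reader. */
    void decRef() {
        int remaining = refCount.decrementAndGet();
        if (remaining == 0) {
            closed = true; // a real reader would release its files/streams here
        } else if (remaining < 0) {
            throw new IllegalStateException("decRef called more often than incRef");
        }
    }

    boolean isClosed() {
        return closed;
    }

    /** Stand-in for a search; fails the same way a closed reader would. */
    String search(String query) {
        if (closed) {
            throw new IllegalStateException("reader is closed");
        }
        return "hits for " + query;
    }
}
```

In this model a searcher that calls `tryIncRef()` in step 2 keeps the reader open through steps 3 and 4, because the owner's `decRef()` after the merge only lowers the count to 1. The reader closes when the searcher calls `decRef()` in a `finally` block. A searcher that skips `tryIncRef()` gets no such protection, which matches the failure reported here.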
> > >> > >>> > >
> > >> > >>> > > On Mon, May 9, 2016 at 3:08 PM, Ravikumar Govindarajan <
> > >> > >>> > > ravikumar.govindarajan@gmail.com> wrote:
> > >> > >>> > >
> > >> > >>> > >> Thanks Aaron...
> > >> > >>> > >>
> > >> > >>> > >> Just a quick question. Lucene itself has ref-counting to close its
> > >> > >>> > >> readers, no? Or does Blur have its own logic to handle it?
> > >> > >>> > >>
> > >> > >>> > >> --
> > >> > >>> > >> Ravi
> > >> > >>> > >>
> > >> > >>> > >> On Fri, May 6, 2016 at 7:56 PM, Aaron McCurry <amccurry@gmail.com> wrote:
> > >> > >>> > >>
> > >> > >>> > >>> Likely yes.  If I have a few minutes this weekend I can look through that
> > >> > >>> > >>> version and see if I can point you in the right direction.
> > >> > >>> > >>>
> > >> > >>> > >>> On Fri, May 6, 2016 at 8:46 AM, Ravikumar Govindarajan <
> > >> > >>> > >>> ravikumar.govindarajan@gmail.com> wrote:
> > >> > >>> > >>>
> > >> > >>> > >>> > Sometimes during an ongoing search we receive an
> > >> > >>> > >>> > IndexReaderClosedException...
> > >> > >>> > >>> >
> > >> > >>> > >>> > We are on an older version of Blur (0.2.2). Has this been fixed in newer
> > >> > >>> > >>> > versions or we have been using it wrongly?
> > >> > >>> > >>> >
> > >> > >>> > >>> > *stackTraceStr:org.apache.lucene.store.AlreadyClosedException: this
> > >> > >>> > >>> > IndexReader cannot be used anymore as one of its child readers was closed*
> > >> > >>> > >>> > at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:257)
> > >> > >>> > >>> > at org.apache.lucene.index.FilterAtomicReader.fields(FilterAtomicReader.java:380)
> > >> > >>> > >>> > at org.apache.blur.index.ExitableReader$ExitableFilterAtomicReader.fields(ExitableReader.java:81)
> > >> > >>> > >>> > at org.apache.lucene.search.TermCollectingRewrite.collectTerms(TermCollectingRewrite.java:52)
> > >> > >>> > >>> > at org.apache.lucene.search.ConstantScoreAutoRewrite.rewrite(ConstantScoreAutoRewrite.java:95)
> > >> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery$ConstantScoreAutoRewrite.rewrite(MultiTermQuery.java:220)
> > >> > >>> > >>> > at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:288)
> > >> > >>> > >>> > at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:412)
> > >> > >>> > >>> >
> > >> > >>> > >>>
> > >> > >>> > >>
> > >> > >>> > >>
> > >> > >>> > >
> > >> > >>> >
> > >> > >>>
> > >> > >>
> > >> > >>
> > >> > >
> > >> >
> > >>
> > >
> > >
> >
>
