lucene-java-user mailing list archives

From Rob Audenaerde <rob.audenae...@gmail.com>
Subject Re: Lucene update performance
Date Tue, 09 May 2017 14:06:07 GMT
As far as I know, the updateDocument method on IndexWriter does a delete
followed by an add. See also the javadoc:

[..] Updates a document by first deleting the document(s)
    containing term and then adding the new
    document.  The delete and then add are atomic as seen
    by a reader on the same index (flush may happen only after
    the add). [..]
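
To make the delete-then-add semantics concrete, here is a minimal sketch in
plain Java. It is a toy in-memory model, not Lucene code: the TinyIndex class
and its field names ("archive", "entry") are made up for illustration. It only
shows that an update keyed on a term removes every matching document (for
example, all documents belonging to one archive) before the replacements are
added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of delete-then-add updates. Hypothetical; NOT Lucene code. */
final class TinyIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    void addDocument(Map<String, String> doc) {
        docs.add(doc);
    }

    /** Removes every document whose field equals value; returns how many were removed. */
    int deleteDocuments(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    /**
     * Update = delete all documents matching (field, value), then add the replacements.
     * The real IndexWriter also makes this pair atomic for readers; this toy does not
     * model readers or flushing at all.
     */
    void updateDocuments(String field, String value, List<Map<String, String>> replacements) {
        deleteDocuments(field, value);
        docs.addAll(replacements);
    }

    int size() {
        return docs.size();
    }

    /** Counts documents whose field equals value. */
    long count(String field, String value) {
        return docs.stream().filter(d -> value.equals(d.get(field))).count();
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        // One archive stored as two documents, plus an unrelated archive.
        index.addDocument(Map.of("archive", "a.zip", "entry", "x.txt"));
        index.addDocument(Map.of("archive", "a.zip", "entry", "y.txt"));
        index.addDocument(Map.of("archive", "b.zip", "entry", "z.txt"));

        // Re-index a.zip: both old documents go away, one new document comes in.
        index.updateDocuments("archive", "a.zip",
                List.of(Map.of("archive", "a.zip", "entry", "new.txt")));

        System.out.println("a.zip docs after update: " + index.count("archive", "a.zip"));
        System.out.println("total docs after update: " + index.size());
    }
}
```

In Lucene itself the corresponding calls would be
IndexWriter.updateDocument(Term, doc) for a single document and
IndexWriter.updateDocuments(Term, docs) for several documents that share the
same term, so doing a manual remove followed by an add should amount to
roughly the same work.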


On Tue, May 9, 2017 at 3:37 PM, Kudrettin Güleryüz <kudrettin@gmail.com>
wrote:

> I do update the entire document each time. Furthermore, this sometimes
> means deleting compressed archives, which are stored as multiple documents
> per archive file, and re-adding them.
>
> Is there an update method, and does it perform better than remove then add?
> I was simply removing modified files from the index (which doesn't seem to
> take long) and re-adding them.
>
> On Tue, May 9, 2017 at 9:33 AM Rob Audenaerde <rob.audenaerde@gmail.com>
> wrote:
>
> > Do you update each entire document? (vs updating numeric docvalues?)
> >
> > That is implemented as 'delete and add' so I guess that will be slower
> > than clean sheet indexing. Not sure if it is 3x slower, that seems a bit
> > much?
> >
> > On Tue, May 9, 2017 at 3:24 PM, Kudrettin Güleryüz <kudrettin@gmail.com>
> > wrote:
> >
> > > Hi,
> > >
> > > For a 5.2.1 index that contains around 1.2 million documents, updating
> > > the index with 1.3 million files seems to take 3X longer than indexing
> > > from scratch. (Files are crawled over NFS; indexes are stored locally
> > > on a mechanical disk (Btrfs).)
> > >
> > > Is this expected for Lucene's update index logic, or should I further
> > > debug my part of the code for update performance?
> > >
> > > Thank you,
> > > Kudret
> > >
> >
>
