lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: incremental update of index
Date Mon, 10 Nov 2008 21:45:35 GMT
It all depends on how many updates you're doing, which
you haven't told us <G>.

If a large majority of your index is being updated, there's
no particular reason to update incrementally; I'd just build a new one.

Best
Erick

On Mon, Nov 10, 2008 at 3:09 PM, ChadDavis <chadmichaeldavis@gmail.com> wrote:

> That's what I thought.
>
> So, that leads me to . . . is it necessarily all that much faster to index
> in an incremental update fashion, rather than just clobbering the old index?
>
> On Mon, Nov 10, 2008 at 12:52 PM, Erick Erickson <erickerickson@gmail.com> wrote:
>
> > You have to have indexed something that uniquely identifies the
> > document in order to know what the old one is. Really, this is
> > the same question as updating, isn't it? If you could update
> > a document in place, you'd have to know what document
> > that was. If you know that information, you know which
> > document to delete.
> >
> > Note that Lucene has no built-in document recognition. If I
> > add the same document to the index twice, Lucene will
> > happily consider them two *separate* documents. You have
> > to code your own notion of document meta-id (as distinct
> > from the Lucene doc id). It could be the URL, the file path
> > on disk, a document ID from your organization... the
> > possibilities are endless. Which is why Lucene can't do that
> > for you.
> >
> > Best
> > Erick
> >
> > On Mon, Nov 10, 2008 at 2:22 PM, ChadDavis <chadmichaeldavis@gmail.com> wrote:
> >
> > > In the FAQ it says that you have to do a manual incremental update:
> > >
> > > > How do I update a document or a set of documents that are already
> > > > indexed?
> > > >
> > > > There is no direct update procedure in Lucene. To update an index
> > > > incrementally you must first *delete* the documents that were
> > > > updated, and *then re-add* them to the index.
> > > >
> > >
> > > How do I determine the existing document that matches the document I am
> > > updating?
> > >
> >
>
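
For readers who want to see the delete-then-re-add idea from the thread spelled out, here is a minimal sketch. It is a toy in-memory model, not Lucene's API: the class ToyIndex, its methods, and the "path" key field are all hypothetical names chosen for illustration. It only shows why you need to index your own unique key, since the internal doc id changes on every add.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Toy stand-in for an index (hypothetical; NOT Lucene's API).
// Each document is a bag of field -> value pairs with an internal doc id.
class ToyIndex {
    private final Map<Integer, Map<String, String>> docs = new LinkedHashMap<>();
    private int nextDocId = 0;

    // Adding never checks for duplicates: the same content added twice
    // becomes two separate documents with two different internal ids.
    int add(Map<String, String> fields) {
        int id = nextDocId++;
        docs.put(id, new HashMap<>(fields));
        return id;
    }

    // Delete every document whose field holds exactly this value.
    // Returns the number of documents removed.
    int deleteWhere(String field, String value) {
        int removed = 0;
        Iterator<Map<String, String>> it = docs.values().iterator();
        while (it.hasNext()) {
            if (value.equals(it.next().get(field))) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // "Update" = delete by our own unique key, then re-add.
    // The replacement gets a fresh internal id.
    int update(String keyField, Map<String, String> fields) {
        String key = Objects.requireNonNull(
                fields.get(keyField), "document must carry its unique key");
        deleteWhere(keyField, key);
        return add(fields);
    }

    int size() {
        return docs.size();
    }

    // Convenience builder using the file path as the application-level key.
    static Map<String, String> doc(String path, String body) {
        Map<String, String> fields = new HashMap<>();
        fields.put("path", path);
        fields.put("body", body);
        return fields;
    }
}
```

Calling add twice with the same path leaves two documents in the toy index, while calling update with that path collapses them into one replacement. The key could just as well be a URL or an organizational document ID, as noted in the thread.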
