lucene-java-user mailing list archives

From ChadDavis <chadmichaelda...@gmail.com>
Subject Re: incremental update of index
Date Mon, 10 Nov 2008 20:09:04 GMT
That's what I thought.

So that leads me to ask: is it necessarily all that much faster to index
incrementally, rather than just clobbering the old index and rebuilding it?

On Mon, Nov 10, 2008 at 12:52 PM, Erick Erickson <erickerickson@gmail.com> wrote:

> You have to have indexed something that uniquely identifies the
> document in order to know what the old one is. Really, this is
> the same question as updating, isn't it? If you could update
> a document in place, you'd have to know what document
> that was. If you know that information, you know which
> document to delete.
>
> Note that Lucene has no built-in document recognition. If I
> add the same document to the index twice, Lucene will
> happily consider them two *separate* documents. You have
> to code your own notion of document meta-id (as distinct
> from the Lucene doc id). It could be the URL, the file path
> on disk, a document ID from your organization... the
> possibilities are endless. Which is why Lucene can't do that
> for you.
>
> Best
> Erick
>
> On Mon, Nov 10, 2008 at 2:22 PM, ChadDavis <chadmichaeldavis@gmail.com> wrote:
>
> > In the FAQ it says that you have to do a manual incremental update:
> >
> > > How do I update a document or a set of documents that are already
> > > indexed?
> > >
> > > There is no direct update procedure in Lucene. To update an index
> > > incrementally you must first *delete* the documents that were updated,
> > > and *then re-add* them to the index.
> >
> > How do I determine the existing document that matches the document I am
> > updating?
> >
>
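
To make the delete-then-re-add pattern from the quoted FAQ concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not Lucene code: the class ToyIndex, its methods, and the "path" field used as the application-level ID are all hypothetical names chosen for illustration. It only shows why a uniquely identifying field has to be indexed before an "update" is possible. In real Lucene, IndexWriter.updateDocument(Term, Document) performs the same delete-then-add in a single call.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for an index (hypothetical, NOT the Lucene API).
// Each "document" is just a map of field name -> value.
class ToyIndex {
    private final List<Map<String, String>> docs = new ArrayList<>();

    // Helper to build a document from field/value pairs.
    static Map<String, String> doc(String... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected field/value pairs");
        }
        Map<String, String> d = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            d.put(keyValues[i], keyValues[i + 1]);
        }
        return d;
    }

    // Plain add: no duplicate detection, so adding the same content
    // twice yields two separate documents.
    void add(Map<String, String> d) {
        docs.add(new LinkedHashMap<>(d));
    }

    // Delete every document whose field equals the given value;
    // returns how many documents were removed.
    int deleteByField(String field, String value) {
        int before = docs.size();
        docs.removeIf(d -> value.equals(d.get(field)));
        return before - docs.size();
    }

    // "Update" = delete by the application-level ID field, then re-add.
    void update(String idField, Map<String, String> d) {
        String id = d.get(idField);
        if (id == null) {
            throw new IllegalArgumentException("document lacks ID field: " + idField);
        }
        deleteByField(idField, id);
        add(d);
    }

    int size() {
        return docs.size();
    }

    // All values stored under a field, in insertion order.
    List<String> values(String field) {
        List<String> out = new ArrayList<>();
        for (Map<String, String> d : docs) {
            if (d.containsKey(field)) {
                out.add(d.get(field));
            }
        }
        return out;
    }
}
```

With this model, calling add twice with the same "path" leaves two documents in the index, while update("path", ...) leaves exactly one, the newest version. The ID field could equally be a URL or an organization-specific document number, as suggested in the reply above.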
