lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: checking existing docs before indexing
Date Thu, 12 Jul 2007 14:17:52 GMT
You have to check yourself. Lucene has no concept of relations
*between* documents. What you're really asking for is something
like a database unique key. No such luck, you have to create
one yourself.

What I've done is post-process the entire index, removing duplicates.
This can be done quite efficiently with TermDocs/TermEnum, and you
can then apply a policy such as keeping only the newest or only the
oldest copy of each document (LIFO or FIFO).
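
As a rough illustration of that post-processing pass, here is a minimal
sketch in plain Java. It is a toy model, not Lucene code: the TreeMap
stands in for what walking TermEnum (each value of a unique-key field)
and then TermDocs (the doc ids holding that value) would give you. The
names DedupSketch, Keep and duplicatesToDelete are made up for the
example, and it assumes doc ids ascend in the order documents were added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of de-duplicating on a unique-key field; not Lucene API. */
final class DedupSketch {
    /** Which copy survives when several documents share a key. */
    enum Keep { FIRST, LAST }

    /**
     * postings maps each key term to the ascending doc ids containing it
     * (hypothetical stand-in for a TermEnum/TermDocs walk).
     * Returns the doc ids that should be deleted under the given policy.
     */
    static List<Integer> duplicatesToDelete(
            TreeMap<String, List<Integer>> postings, Keep policy) {
        List<Integer> toDelete = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> entry : postings.entrySet()) {
            List<Integer> docs = entry.getValue();
            if (docs.size() < 2) {
                continue; // this key is already unique
            }
            int keepIdx = (policy == Keep.FIRST) ? 0 : docs.size() - 1;
            for (int i = 0; i < docs.size(); i++) {
                if (i != keepIdx) {
                    toDelete.add(docs.get(i));
                }
            }
        }
        return toDelete;
    }
}
```

The actual delete call is left out on purpose; the sketch only decides
which doc ids go, which is the part the keep-first or keep-last policy
controls.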

You could certainly also check before adding a document, again
using TermEnum/TermDocs.
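
The check-before-add variant might look something like the sketch
below. Again this is plain Java modelling the idea, not Lucene calls:
the HashSet plays the role of "does any document already carry this
key term", which in a real index you would answer by looking the term
up. CheckBeforeAddSketch and addIfAbsent are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of checking a unique key before indexing; not Lucene API. */
final class CheckBeforeAddSketch {
    private final Set<String> keys = new HashSet<>();   // keys already indexed
    private final List<String> docs = new ArrayList<>(); // stand-in for the index

    /** Adds the document only if its key is new; returns true if it was added. */
    boolean addIfAbsent(String key, String body) {
        if (!keys.add(key)) {
            return false; // key already present, skip the duplicate
        }
        docs.add(body);
        return true;
    }

    /** Number of documents currently held. */
    int size() {
        return docs.size();
    }
}
```

Either way, the document needs a field whose value you treat as the
unique key, since nothing in the index enforces uniqueness for you.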

Best
Erick

On 7/12/07, Heba Farouk <heba.farouk@yahoo.com> wrote:
>
> Hello,
> I'm a newbie to the Lucene world and I hope that you can help me.
> Is there any option in IndexWriter to check whether a document
> already exists before adding it to the index, or should I maintain that
> manually?
>
> thanks in advance
>
>
> Yours
>
> Heba
>
>
