lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: updating index
Date Thu, 22 Feb 2007 00:02:38 GMT
I think you can get MUCH better efficiency by using TermEnum/TermDocs. But I
think you need to index your primary key (UN_TOKENIZED) for that, although
now I'm not sure. I've always assumed that TermEnums only work on indexed
fields, and I'd be surprised if TermEnum worked with un-indexed data. Still,
it'd be worth trying.

Anyway, your loop looks more like this...

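TermEnum terms = reader.terms(new Term("primarykey", ""));
TermDocs tDocs = reader.termDocs();

// terms(Term) is already positioned on the first term, hence do/while
do {
   Term t = terms.term();
   if (t == null || !"primarykey".equals(t.field())) break; // past our field
   if (docsToUpdate.contains(t.text())) {
       tDocs.seek(t);
       if (tDocs.next()) { // advance before reading tDocs.doc()
           // newDoc: placeholder for the rebuilt Document for this key
           writer.updateDocument(t, newDoc);
       }
   }
} while (terms.next());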

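NOTE: I've been fast and loose with edge conditions, so caveat emptor. In
particular, IndexReader.terms(Term) returns an enumeration that is already
positioned on the first term at or after the one you pass in, so a loop that
begins with while (terms.next()) skips that first term, and the enumeration
runs on into the terms of later fields unless you stop it. This loop also
assumes that there is one and only one document in your index with a given
primary key. Otherwise, you have to do some more work with the TermDocs
class to process each document that has that primary key.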

This is similar to creating Lucene filters, which is very fast....

Hope this helps
Erick



On 2/21/07, no spam <mrs.nospam@gmail.com> wrote:
>
> I have an index where I'm storing the primary key of my database record as
> an unindexed field.   Nightly I want to update my search index with any
> database changes / additions.
>
> I don't really see an efficient way to update these records besides doing
> something like this which I'm worried will thrash the index.  Is this
> approach good/bad/ugly?
>
> Thanks,
> Mark
>
> IndexReader reader;
> ArrayList docsToUpdate;
>
> for (int i = 0; i < reader.maxDoc(); i++)
> {
>     Document doc = reader.document(i);
>     if (doc != null)
>     {
>        String primaryKey = doc.get("id");
>
>         if (docsToUpdate.contains(primaryKey))
>         {
>              // set fields
>              writer.updateDocument(doc);
>         }
>     }
> }
>
> // for all docs not found in index
> for (DBObject o : docsToUpdate)
> {
>     if (o.syncedWithIndex() == false)
>     {
>        // create new doc
>       Document doc = ....;
>
>        // this is a new doc
>        writer.addDocument(doc);
>     }
> }
>
