lucene-java-user mailing list archives

From "Lars Hammer" <ham...@dezide.com>
Subject updating a document
Date Wed, 20 Aug 2003 07:22:54 GMT
Hello

I'm trying to update a document in my index. As far as I can tell from the FAQ and other
documentation, the only way to do this is to delete the document and add it again.

Now, I want to be able to add the document anew without having to re-parse the original
file. That is, I want to extract a document from the index (and keep a copy in memory),
delete the document from the index, update a field in the in-memory doc, and add the doc to
the index once again.

I imagine that it has to be done something like this:

1. extract the desired document from the index with a function returning the document
(incomplete code):

Document doc;
String fileNameToGetFromIdx;
String tmpName;

   for ( int i = 0; i < numDocs; i++ )
   {
    if ( !indexreader.isDeleted( i ) )
    {
     doc = indexreader.document( i );

     if ( doc != null )
     {
      tmpName = doc.get( "pathToFileOnDisk" );

      if ( tmpName.equals( fileNameToGetFromIdx ) )
      {
       indexreader.delete( i );
       return doc;
      }
     }
    }
   }

This would leave me with the document in memory and the document deleted from the index, right?


2. update a field in the document by adding the field again.

 doc.add( Field.Text( "someField", "value" ) );

The API for Document says that if multiple fields exist with the same name, the value of
the last one added is returned when getting the value.


3. add the document to the index again

indexwriter.addDocument( doc );
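
To make the intended flow concrete, here is a minimal sketch of the three steps in plain Java.
It does not use Lucene at all: ToyDocument, ToyIndex and UpdateSketch are hypothetical stand-ins
that only mimic the behaviour I described above (documents marked as deleted, and the last
value added under a field name winning on lookup). It shows what I am trying to achieve, not
how Lucene actually behaves.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a document: an ordered list of name/value fields.
class ToyDocument {
    private final List<String[]> fields = new ArrayList<>();

    void add(String name, String value) {
        fields.add(new String[] { name, value });
    }

    // Assumption taken from the text above: the last value added under a name wins.
    String get(String name) {
        for (int i = fields.size() - 1; i >= 0; i--) {
            if (fields.get(i)[0].equals(name)) {
                return fields.get(i)[1];
            }
        }
        return null;
    }
}

// Hypothetical stand-in for the index: slots are marked deleted and never reused.
class ToyIndex {
    private final List<ToyDocument> docs = new ArrayList<>();
    private final List<Boolean> deleted = new ArrayList<>();

    void addDocument(ToyDocument doc) { docs.add(doc); deleted.add(false); }
    void delete(int i) { deleted.set(i, true); }
    boolean isDeleted(int i) { return deleted.get(i); }
    ToyDocument document(int i) { return docs.get(i); }

    // Number of slots ever used, including deleted ones.
    int maxDoc() { return docs.size(); }

    // Number of documents that are not marked deleted.
    int numDocs() {
        int live = 0;
        for (boolean d : deleted) {
            if (!d) live++;
        }
        return live;
    }
}

class UpdateSketch {
    // Step 1: find the document by path, mark it deleted, return it.
    static ToyDocument extractAndDelete(ToyIndex index, String path) {
        // Scan every slot: in this toy model deleted slots keep their numbers.
        for (int i = 0; i < index.maxDoc(); i++) {
            if (index.isDeleted(i)) continue;
            ToyDocument doc = index.document(i);
            if (path.equals(doc.get("pathToFileOnDisk"))) {
                index.delete(i);
                return doc;
            }
        }
        return null;
    }

    // Steps 2 and 3: add the field again with the new value, then re-add the document.
    static boolean incrementVisited(ToyIndex index, String path) {
        ToyDocument doc = extractAndDelete(index, path);
        if (doc == null) return false;
        String old = doc.get("visited");
        int visits = (old == null) ? 0 : Integer.parseInt(old);
        doc.add("visited", Integer.toString(visits + 1));
        index.addDocument(doc);
        return true;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        ToyDocument doc = new ToyDocument();
        doc.add("pathToFileOnDisk", "/docs/manual.pdf");
        doc.add("visited", "0");
        index.addDocument(doc);

        incrementVisited(index, "/docs/manual.pdf");
        System.out.println(index.document(1).get("visited"));
    }
}
```

In this toy model the original file is never parsed again; only the in-memory document is
changed and re-added. Whether a document read back from a real index still carries everything
needed to re-add it is exactly the part I am unsure about.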


Is this a correct way of doing an update? I can't seem to get it to work properly.
The reason for trying it this way is to avoid having to reindex the original file. I have
many large PDF documents, which take some time to index :-(

Bottom line: when I do a search, a list of search results is displayed to the user. The
user clicks the title of a document and the document is shown. Before the document
is shown, I execute an update function to increase the number of times the document has been
visited, hence I need to update the visited field of that particular document in the index.

Uhm, hope you get the idea :-)

Any suggestions and comments are very welcome

thanks in advance

BTW: does anyone know if an update function is planned for Lucene? Would it be
hard to write one yourself?


/Lars Hammer

www.dezide.com