lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ivan Vasilev" <>
Subject Lucene index update
Date Thu, 04 Jan 2007 15:14:49 GMT
Hi All,

I want to update some documents in existing indexes by adding a new field to 
each of their documents. The documents contained in the indexes have some 
fields that are indexed and NOT stored. The new field that will be added 
will contain some metadata and will be Stored and not indexed.

To fulfill the task I think I have 4 possible approaches:

1. To use Lucene API to do this. According to my research I found that 
Lucene does not have methods for updating documents directly.

2. To extract documents including their fields (including the most important 
fields - those that are indexed but not stored). Then to add the new field 
with relevant value to these documents, delete them from the index and 
finally add them again in the index (including their new field).

    This is the approach recommended in the book "Lucene in Action" and in 
the FAQ page on the Lucene's site - delete documents and add them again.

    The problem here is that according to Lucene's documentation "fields 
which are not stored are not available in documents retrieved from the 
index, e.g. with Hits.doc(int), Searcher.doc(int) or 
IndexReader.document(int)". (The tests showed the same :) ). So if this 
approach could work the problem here is: How to extract information of the 
fields that are indexed but not stored?

3. To create a tool that changes the data stored in the indexes. This 
approach seems to be too "dangerous" but if anyone knows the format that 
Lucene uses to keep these data please tell me.

4. To reindex - means to delete existing indexes and create them in new by 
adding the necessary fields. As the indexes could be very large creating 
them in new will be too slow process and so this approach is least 
desirable. But I am afraid it is the only one that is practically possible.

If someone could help me with the approaches 1-3 or tell me some other 
please help me :)

Thanks in advance,


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message