lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Subhrajyoti Moitra" <subhrajyo...@contata.co.in>
Subject differential indexing
Date Mon, 07 Apr 2003 05:54:45 GMT
Hi,
I am trying to append one index to another. How should i do it?

Let me explain my problem, probably people can suggest some better way..

i have indexed a set of pdf documents. These documents are being retrieved from the DB. I
have a unique docId associated with each document. This is one of the fields in the index
entries. I am using pdfbox to parse the contents of the pdf document and convert it into text
for indexing.

Scenario-I
Now some one adds a new document to the DB. What i am presently doing is that, i retrieve
all the documents from the DB,
 including the new one, and create a fresh index out of these set of documents. The problem
here is time.
I have some 10,000 documents in my system, re-indexing every one of them again is taking a
hell-of-a long time.
What i want is to "APPEND" the new document index-data to the existing index.

Scenario-II
When an existing document in the DB is changed i want to remove that document from the index
(this is easy since i have the unique docId with me) and add the new modified index-data to
the existing index, instead of again recreating the entire index.

To sum up how do i do differential indexing. (hope i am using the proper terminology)

Some one please suggest some solutions to this.

Thank you in advance.
Subhro.
Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message