lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Anshum <anshum.gu...@naukri.com>
Subject Re: Controlling index file name
Date Fri, 04 Apr 2008 04:55:45 GMT
Hi,

I guess (but 'm not quite sure) you are looking for a way to
incrementally index(+update existing index), there would be a lot info
available on the same. 
What I would suggest would be deleting the indexes from the current
index using deleteDocuments
(http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/index/IndexWriter.html#deleteDocuments(org.apache.lucene.index.Term))
and follow it up by adding the same (but updated) docs into the index. You could periodically
merge the indexes (using lucene merger) to avoid the incremental index to grow to a huge size.

To search the indexes you could use a multi searcher for the incremental
index and base index until merged.
Important to mention, for this you would need to design someway to
lookup and compile a list of modified docs(to be newly indexed or
reindexed) and pass it to lucene while indexing.

--
Anshum


On Thu, 2008-04-03 at 18:27 +0530, Bhavin Pandya wrote:
> I also faced same problem in past.
> But in my case the index size was not the issue so i maintained two folder
> "newindex" and "oldindex"... and swaping at every update.
> 
> -Bhavin pandya
> 
> ----- Original Message -----
> From: "021336" <smith_brad@bah.com>
> To: <java-user@lucene.apache.org>
> Sent: Tuesday, April 01, 2008 9:44 PM
> Subject: Controlling index file name
> 
> 
> >
> > We use Lucene to create simple data stores that we deploy with our
> > application.  Our application also supports auto-updating and we refresh
> > these data stores monthly.  Since Lucene computes the names for the index
> > we
> > end up deploying new files each time while leaving the old files to
> > continue
> > taking up space needlessly.
> >
> > The options I see to resolve this are:
> > - After each update check all the data store locations and delete CFS
> > files
> > that have a date earlier then the Segments files, or
> > - Create the index files with a consistent name so that we overwrite
> > existing files with new files during each auto-update.
> >
> > I found one thread discussing this using "AddIndexes" but I could not
> > follow
> > it through to implementation.  Also, we are using Java 1.4 not 1.5.
> >
> > Any suggestions?
> >
> >
> > --
> > View this message in context:
> > http://www.nabble.com/Controlling-index-file-name-tp16418709p16418709.html
> > Sent from the Lucene - Java Users mailing list archive at Nabble.com.
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message