lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Greg Gershman <gregge...@yahoo.com>
Subject Re: Help with mass delete from large index
Date Mon, 13 Feb 2006 15:24:14 GMT
No problem; this is not meant to be a regular
operation, rather it's a (hopefully) one-time thing
till the index can be restructured.

The data is chronological in nature, deleting
everything before a specific point in time.  The index
is optimized, so is it possible to remove specific
files?  I'm open to other suggestions as to how to
approach this.

Also, I neglected to mention I'm using version 1.4.3.

Greg 


--- "Michael D. Curtin" <mike@curtin.com> wrote:

> Greg Gershman wrote:
> 
> > I'm trying to delete a large number of documents
> > (~15million) from a a large index (30+ million
> > documents).  I've started with an optimized index,
> and
> > a list of docIds (our own unique identifier for a
> > document, not a Lucene doc number) to pass to the
> > IndexReader.delete(Term t) method.  I've had a few
> > different problems.
> > ...
> > Any ideas?  I'm really confused, and the only
> other
> > option I can think of is to reindex the documents
> I
> > need, which would take much longer than deleting
> the
> > ones I dont.
> 
> Maybe it would be useful to take a step back up the
> tree of abstractions here 
> and reexamine why you're deleting such a large
> fraction of your index, 
> particularly if you're doing it on a regular basis. 
> For example, is there a 
> chronological or other "natural" break in the data
> such that you could make 2 
> indexes with ~15M docs each in the first place, then
> just delete a few index 
> *files* instead of 15M documents, one at a time?
> 
> --MDC
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail:
> java-user-help@lucene.apache.org
> 
> 


__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message