lucene-dev mailing list archives

From Andrzej Bialecki ...@getopt.org>
Subject Re: Optimize and internal document order
Date Fri, 31 Aug 2007 08:04:56 GMT
Karl Wettin wrote:
> 
> On 30 Aug 2007, at 22:50, Andrzej Bialecki wrote:
> 
>> I think this is possible to achieve by using a FilterIndexReader, 
>> which keeps a map of updated documents, and re-maps old doc ids to the 
>> new ones on the fly.
>>
>> From time to time I'd like to optimize the "aux" index to get rid of 
>> deleted docs. At this time I need to figure out how to preserve the 
>> old->new mapping during the optimization.
> 
> Perhaps my experiment in LUCENE-879 could be of interest.

Thanks for the pointer. As far as I understand the patch, this doesn't 
help me, because I want to remap a newly added document so that it 
appears to be at the old position. Eventually, when the index becomes 
full of deletions, I want to optimize it, i.e. get rid of dead weight 
completely.
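
To make the remapping idea concrete, here is a minimal sketch in plain
Java (no Lucene classes). DocIdRemapper, recordUpdate, auxDocFor and
compact are hypothetical names for illustration, not anything from
Lucene or from LUCENE-879. The sketch assumes that optimizing the "aux"
index drops deleted docs and renumbers the survivors in their original
order, so the table can be rebuilt from the deletion bits captured just
before the optimize.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: maps main-index doc ids to their replacements in the aux index. */
public class DocIdRemapper {
    // main-index doc id -> doc id of the updated document in the aux index
    private final Map<Integer, Integer> mainToAux = new HashMap<>();

    /** Records that the document at mainDocId has been replaced by auxDocId. */
    public void recordUpdate(int mainDocId, int auxDocId) {
        mainToAux.put(mainDocId, auxDocId);
    }

    /** Returns the aux doc id for an updated document, or -1 if it was never updated. */
    public int auxDocFor(int mainDocId) {
        Integer aux = mainToAux.get(mainDocId);
        return aux == null ? -1 : aux;
    }

    /**
     * Rewrites the table after the aux index has been compacted.
     * ASSUMPTION: compaction removes deleted docs and keeps the relative
     * order of the remaining ones, so each survivor moves down by the
     * number of deleted docs that preceded it.
     */
    public void compact(BitSet deletedAuxDocs, int auxMaxDoc) {
        int[] newIds = new int[auxMaxDoc];
        int next = 0;
        for (int old = 0; old < auxMaxDoc; old++) {
            newIds[old] = deletedAuxDocs.get(old) ? -1 : next++;
        }
        mainToAux.replaceAll((mainId, auxId) -> newIds[auxId]);
        // Defensive: drop any entry that still pointed at a deleted aux doc.
        mainToAux.values().removeIf(id -> id < 0);
    }
}
```

A reader wrapper would consult auxDocFor() for each main doc id and fall
back to the main index when it returns -1. compact() has to be called
with the deletion bits as they were before the optimize, since that
information is gone afterwards.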

Please correct me if I'm wrong, but your patch replaces deleted 
documents with empty documents. Over time the index therefore keeps 
growing without bound, which is hardly equivalent to optimize(). So 
the main benefit I can see from your patch is that it corrects the 
tf / idf statistics to account for deleted docs.


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



