lucene-dev mailing list archives

From Adrien Grand <jpou...@gmail.com>
Subject Re: Purging unused fields during merges
Date Mon, 18 Jan 2016 20:01:15 GMT
Hi Erick,

It is not clear to me what remaining data you are looking to get rid of.

You could purge unused field numbers by calling IndexWriter.addIndexes on a
reader wrapper that removes unused fields, but Lucene will not do this by
itself. Reusing the same field numbers across segments is what makes it
possible to copy raw bytes when merging stored fields (otherwise you would
have to decode and re-encode everything in order to remap field numbers).
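
To illustrate the trade-off, here is a toy model in plain Java. It does not
use the Lucene API or Lucene's actual stored fields format: the record layout
([field number][length][bytes]) and all names (FieldNumberSketch, encode,
mergeRaw, denseRemap, rewrite) are made up for this sketch, and it assumes
field numbers and value lengths below 128. It only shows that a merge with
shared numbering can append bytes blindly, while renumbering has to walk
every record.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

/** Toy model only: NOT Lucene's stored fields format or API. */
class FieldNumberSketch {

    /** Encodes one stored field as [fieldNumber][length][UTF-8 bytes]. */
    static byte[] encode(int fieldNumber, String value) {
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[v.length + 2];
        out[0] = (byte) fieldNumber;   // assumed < 128 in this sketch
        out[1] = (byte) v.length;      // assumed < 128 in this sketch
        System.arraycopy(v, 0, out, 2, v.length);
        return out;
    }

    /** Merge with shared numbering: bytes are appended without being parsed. */
    static byte[] mergeRaw(byte[]... segments) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] s : segments) {
            out.write(s, 0, s.length);
        }
        return out.toByteArray();
    }

    /** Assigns dense new numbers to the field numbers still present in the data. */
    static Map<Integer, Integer> denseRemap(byte[] data) {
        TreeMap<Integer, Integer> remap = new TreeMap<>();
        for (int pos = 0; pos < data.length; pos += 2 + data[pos + 1]) {
            remap.put((int) data[pos], 0);
        }
        int next = 0;
        for (Map.Entry<Integer, Integer> e : remap.entrySet()) {
            e.setValue(next++);
        }
        return remap;
    }

    /** Renumbering: every record has to be visited and patched. */
    static byte[] rewrite(byte[] data, Map<Integer, Integer> remap) {
        byte[] out = data.clone();
        for (int pos = 0; pos < out.length; pos += 2 + out[pos + 1]) {
            out[pos] = remap.get((int) out[pos]).byteValue();
        }
        return out;
    }

    public static void main(String[] args) {
        // Both segments use numbers 0 and 7; numbers 1..6 are no longer used.
        byte[] seg1 = mergeRaw(encode(0, "id1"), encode(7, "x"));
        byte[] seg2 = mergeRaw(encode(0, "id2"), encode(7, "y"));
        byte[] merged = mergeRaw(seg1, seg2);
        Map<Integer, Integer> remap = denseRemap(merged);
        byte[] purged = rewrite(merged, remap);
        System.out.println("remap = " + remap + ", bytes = " + purged.length);
    }
}
```

In this toy layout the field number can be patched in place, but the rewrite
still has to parse every record; with a compressed on-disk format, the same
renumbering would imply decoding and re-encoding the data.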

Maybe what you remember are these 2 issues that improved the memory usage
of FieldInfos in the sparse case?
 - https://issues.apache.org/jira/browse/LUCENE-6325
 - https://issues.apache.org/jira/browse/LUCENE-6630

On Mon, 18 Jan 2016 at 20:37, Erick Erickson <erickerickson@gmail.com>
wrote:

> I _swear_ I've seen this go by before, but can't find it.
>
> Let's say I have removed _all_ documents from my index that mention a
> particular field (dynamic in this case). Merging segments apparently
> does not remove that data from the merged segment, and the extra
> information survives restarts, optimizations and all that. By chance, I
> did get it to work when I had a single segment and updated _all_ the docs
> in that segment without the extra fields, so some quick tests will
> succeed if you're not careful.
>
> Usually, the answer is "who cares? A field or two 'extra' won't have
> any effect". In this case, though, there are over 20K extra dynamic
> fields added mistakenly (which is why I like to remove dynamic field
> definitions from solr when possible).
>
> This is true on 5.x.
>
> Ring any bells?