lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: problem found with DiskDocValuesFormat
Date Thu, 22 Aug 2013 06:15:30 GMT
On Thu, Aug 22, 2013 at 1:48 AM, Sean Bridges <sean.bridges@gmail.com> wrote:
> Is there a supported DocValuesFormat that doesn't load all the values into
> ram?

Not with any current release, but in Lucene 4.5, if all goes well, the
official implementation will work that way (I spent essentially the
last entire week on this and committed it yesterday).

Integrating new ideas into the official format takes a good amount of
effort: lots of documentation and testing and so on, because we have
to live with supporting that format for a long time, all the way until
5.9.
>
> We can't reindex every time we upgrade lucene since our indexes are too
> large.  Should we copy the code from DiskDocValuesFormat and call it
> CustomDiskDocValuesFormat, and give CustomDiskDocValuesFormat a new name so
> that when we upgrade lucene, we won't use an incompatible version of
> DiskDocValuesFormat?
>

You can certainly maintain your own codec components: you can even
name them the same thing as long as you put your .jar file first in
the classpath (that's how SPI works: first one wins).
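
To illustrate the "first one wins" rule, here is a minimal sketch in plain
Java. It is not Lucene's actual loader code, only a simulation of a lookup
table built in classpath order. The classes NamedFormat and FirstOneWins,
the jar names, and the use of "Disk" as the registered name are all
hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a named codec component such as a DocValuesFormat.
class NamedFormat {
    final String name;    // the name a component registers under (illustrative)
    final String origin;  // which jar supplied this implementation (illustrative)

    NamedFormat(String name, String origin) {
        this.name = name;
        this.origin = origin;
    }
}

class FirstOneWins {
    // Build a name -> implementation table from providers listed in classpath order.
    static Map<String, NamedFormat> register(List<NamedFormat> inClasspathOrder) {
        Map<String, NamedFormat> byName = new LinkedHashMap<>();
        for (NamedFormat f : inClasspathOrder) {
            // Keep the first provider seen for each name; later duplicates are ignored.
            byName.putIfAbsent(f.name, f);
        }
        return byName;
    }

    public static void main(String[] args) {
        List<NamedFormat> found = new ArrayList<>();
        found.add(new NamedFormat("Disk", "my-codecs.jar"));     // earlier on the classpath
        found.add(new NamedFormat("Disk", "lucene-codecs.jar")); // same name, shadowed
        Map<String, NamedFormat> table = register(found);
        System.out.println(table.get("Disk").origin); // prints "my-codecs.jar"
    }
}
```

Registering your copy under a different name instead avoids depending on
classpath order at all, since the two entries then never collide.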

But it's not really a cure-all; it's some work either way: codec APIs
themselves change too, so you have to deal with that on upgrade (e.g.
DocValuesProducer gets a new method in 4.5 as it's now capable of
representing missing values, and the iterators in DocValuesConsumer
now provide null when a document is missing a value).

Or, you can avoid reindexing by using addIndexes as I suggested (just
buy a few big chips of RAM for the upgrade).
