Sometimes it's REALLY impossible to reindex, or the cost of doing so in a running production system is absolutely prohibitive. I can't shut it down for maintenance, so I would need a lot of hardware to reindex ~5 billion documents. I have no idea what it would cost to retrieve all that data again, but I estimate it to be quite a lot.

And providing a way to migrate existing indexes to a new Lucene is crucial from my point of view.

I don't care what this way is: calling optimize() with a newer Lucene or running some tool that takes 5 days, either is OK with me.

Just don't put me through full reindexing, as I really don't have all that data anymore.
It's not my data; I just receive it from clients and provide a search interface.

It took years to build those indexes; rebuilding is not an option, and staying with an old Lucene forever just sucks.


On Thu, Apr 15, 2010 at 14:57, Robert Muir <> wrote:

On Thu, Apr 15, 2010 at 7:52 AM, Shai Erera <> wrote:
Well ... I must say that I completely disagree w/ dropping index structure back-support. Our customers will simply not hear of reindexing 10s of TBs of content because of version upgrades. Such a decision is key to Lucene adoption in large-scale projects. It's entirely not about whether Lucene is a content store or not - content is stored on other systems, I agree. But that doesn't mean reindexing it is tolerable.

I don't understand how it's helpful to do a MAJOR version upgrade without reindexing... what in the world do you stand to gain from that?

The idea here is that development can be free of such hassles. Development should be this way.

If you, Shai, need some feature X.Y.Z from Version 4 and don't want to reindex, and are willing to do the work to port it back to Version 3 in a completely backwards compatible way, then under this new scheme it can happen.

Robert Muir