incubator-lucy-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [lucy-dev] Index modernizer
Date Thu, 11 Nov 2010 16:18:30 GMT
On Thu, Nov 11, 2010 at 09:22:00AM -0600, Peter Karman wrote:
> > As the index format changes, we accumulate cruft in our codebase to support
> > old indexes and old segments.  At some point, we need to purge such cruft and
> > abandon support for old indexes.  But if you are a user, it's hard to know
> > whether your index has old segments in it, and whether you can upgrade safely
> > to a given version of the library.
> 
> You're describing the back compat path for KS users switching to Lucy, yes?

The "modernizer" approach addresses a general problem for the Lucy/Lucene
segmented index design, and it will be useful at every major index format
break going forward.  But yes, I'm thinking that the first use case would be
dropping support for segments originally written under KinoSearch.

I've recently whipped up a patch that allows Lucy to read KinoSearch indexes.
All that we need to do is alias a few class names so that deserializing a
Schema works properly -- i.e. when the Schema JSON file contains a serialized
"KinoSearch::Analysis::CaseFolder", the object that emerges from the
deserializer is a Lucy::Analysis::CaseFolder.
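To illustrate the aliasing idea, here is a minimal sketch in Python rather than Lucy's own C/Perl. It is not the actual patch. It assumes the Schema dump decodes to nested hashes and arrays whose class names are stored under a "_class" key. The alias table, the prefix fallback, and both function names are hypothetical.

```python
import json

# Hypothetical alias table: serialized KinoSearch name -> Lucy name.
CLASS_ALIASES = {
    "KinoSearch::Analysis::CaseFolder": "Lucy::Analysis::CaseFolder",
}


def alias_class_name(name):
    """Map a serialized class name onto its Lucy equivalent."""
    if name in CLASS_ALIASES:
        return CLASS_ALIASES[name]
    # Assumed fallback: rewrite anything else in the KinoSearch namespace.
    if name.startswith("KinoSearch::"):
        return "Lucy::" + name[len("KinoSearch::"):]
    return name


def alias_schema(node):
    """Recursively rewrite every "_class" value in a decoded Schema dump."""
    if isinstance(node, dict):
        return {
            key: alias_class_name(value)
            if key == "_class" and isinstance(value, str)
            else alias_schema(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [alias_schema(item) for item in node]
    return node


# Usage: decode the Schema JSON, alias the class names, then deserialize.
dumped = json.loads('{"_class": "KinoSearch::Analysis::CaseFolder"}')
aliased = alias_schema(dumped)
```

Rewriting the names before deserialization keeps the deserializer itself unaware of KinoSearch, so the alias table is the only piece that would need to be removed later.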

However, the Lucy codebase currently supports a number of obsolete KinoSearch
segment formats, and it would be nice to drop that support and clean out the
cruft at some point.  For whichever Lucy release drops that support, we would
announce that users who had migrated indexes from KinoSearch need to run the
modernizer.  (Indexes initialized under Lucy would not need modernization, as
we could guarantee that they were written in a recent format.)

Providing a clean migration path for KinoSearch users allows us to put
KinoSearch into maintenance mode and focus exclusively on developing Lucy.  At
the same time, having the modernizer in reserve holds the promise that we
won't be burdened forever by the backwards compatibility requirements of old
index formats.

> Maybe write the cookbook recipe for upgrading KS to Lucy, and then we can
> see if it needs to be formalized into a part of the core?

OK, sounds like the cookbook approach is feasible and prudent.  We don't
technically *need* the modernizer until we decide to drop support for an old
format, though.

Marvin Humphrey

