lucy-dev mailing list archives

From Marvin Humphrey <mar...@rectangular.com>
Subject Re: [KinoSearch] Release strategies
Date Fri, 15 Jan 2010 04:22:32 GMT
On Thu, Jan 14, 2010 at 10:20:34AM -0600, Peter Karman wrote:
> Are the file format problems actually bugs in the current format, or features 
> you would like to see added?

Both.  With regard to features, there are two:

  * Make term dictionaries pluggable to work with non-text field types.
  * Make posting formats able to work with multiple streams.

We can probably handle those without hard compatibility breaks; it will just
be more of a pain. 

Then there's one long-standing file format bug:

  * Skipping on SegPostingList is disabled because the skip files are broken.

I'd like to get that one taken care of before we make a non-dev release, as it
will help performance for a variety of queries.
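
To illustrate why skipping matters, here is a minimal sketch of a posting list
with skip entries.  It is a toy in-memory model, not the SegPostingList
implementation or its on-disk skip file format; the class name, the
skip_interval parameter, and the steps counter are all hypothetical.

```python
from bisect import bisect_right


class ToyPostingList:
    """Toy in-memory posting list with skip entries (hypothetical model)."""

    def __init__(self, doc_ids, skip_interval=4):
        self.doc_ids = sorted(doc_ids)
        # One skip entry per skip_interval postings: (doc_id, position).
        self.skips = [
            (self.doc_ids[i], i)
            for i in range(0, len(self.doc_ids), skip_interval)
        ]
        self.pos = 0
        self.steps = 0  # number of postings examined, for comparison only

    def advance(self, target, use_skips=True):
        """Move to the first doc_id >= target; return it, or None if exhausted."""
        if use_skips:
            # Find the last skip entry whose doc_id <= target and jump to it
            # if that is further along than the current position.
            idx = bisect_right(self.skips, (target, float("inf"))) - 1
            if idx >= 0 and self.skips[idx][1] > self.pos:
                self.pos = self.skips[idx][1]
        # Finish with a linear scan from wherever we landed.
        while self.pos < len(self.doc_ids):
            self.steps += 1
            if self.doc_ids[self.pos] >= target:
                return self.doc_ids[self.pos]
            self.pos += 1
        return None


with_skips = ToyPostingList(range(0, 1000, 3))
without_skips = ToyPostingList(range(0, 1000, 3))
hit_a = with_skips.advance(500)
hit_b = without_skips.advance(500, use_skips=False)
```

Both calls land on the same document, but the skipping version examines only a
handful of postings after the jump, while the linear version walks every
posting before the target.  Conjunctive queries that advance one list to the
other's current document are the sort of case where this tends to pay off.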

> CPAN's versioning is not ideal in that regard.

I'll be less kind.  It's grievously flawed.

But these problems are also just hard to avoid by nature when dealing with
dynamic dependencies and global namespaces.  We have the same problem with C.

> However, there are already checks in Build.PL for incompatible index formats, 

If we release into a new "KinoSearch2" namespace, those checks can be
discarded.  They're pretty unfriendly to the CPAN toolchain -- you can get
yourself on a CPAN Testers blacklist by hanging on manual user input.

> Rebuilding an index is not the end of the world. We (and by we I mean search 
> developers) do it all the time, even with big doc corpora.

If you have a lot of fast-moving indexes, and an expectation of uninterrupted
uptime, it can be difficult to schedule swaps.  That's our situation here at
Eventful.  We could do it, but it would be a major PITA.

Ironically, this pushes in the direction of release, because managing a hard
compat break would probably cost us more than it costs us to have me write
bridge code.

> Small, stable, incremental and frequent releases to CPAN. I've been converted to 
> that idea.

I've also seen the benefits of date-driven release schedules, for instance as
now practiced by the Perl 5 Porters.  Jesse Vincent has managed to get a bunch
of good devs more highly involved by separating the roles of release manager
and pumpking.

In the abstract, I'd like to try that.  It would require changes to my personal
development routines, and Git is better suited to it than Subversion, but I
think we could make it work.  

> What about #3: stabilize svn trunk and release it as KS 0.30.
> 
> When there's another index compat change, release it as KS 0.40, etc.

I don't want to screw over the MojoMojo people, the Socialtext people, etc. By
releasing into a new namespace we give our users a lot more options for
transitioning.

I actually think we might be able to go a long way without hard compat breaks
in the file format.  Maybe even from here on out.  Now that all metadata is in
JSON, the sort cache issue is solved, and we have a provisional implementation
for pluggable index components, it should be a lot easier to keep compat
across a transitional release at least, and usually longer.  
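
As a rough sketch of how JSON metadata plus pluggable components can keep old
segments readable, a reader can look up a per-component format number and
dispatch to a matching decoder.  Everything here is an assumption made for
illustration: the "lexicon" key, the "format" field, and the decoder functions
are hypothetical and do not describe the actual KinoSearch metadata layout.

```python
import json


# Hypothetical decoders, one per supported format of a component.
def read_lexicon_v1(meta):
    return {"component": "lexicon", "format": 1, "text_only": True}


def read_lexicon_v2(meta):
    return {"component": "lexicon", "format": 2, "text_only": False}


# Registry keyed by (component name, format number).  Keeping the old entry
# around is the "bridge code" that lets old segments stay readable.
READERS = {
    ("lexicon", 1): read_lexicon_v1,
    ("lexicon", 2): read_lexicon_v2,
}


def open_component(name, segment_meta_json):
    """Pick a decoder based on the format recorded in the segment metadata."""
    meta = json.loads(segment_meta_json)[name]
    fmt = meta["format"]
    try:
        reader = READERS[(name, fmt)]
    except KeyError:
        raise ValueError(f"unsupported {name} format {fmt}") from None
    return reader(meta)


old_segment = json.dumps({"lexicon": {"format": 1}})
new_segment = json.dumps({"lexicon": {"format": 2}})
old_reader = open_component("lexicon", old_segment)
new_reader = open_component("lexicon", new_segment)
```

With a scheme like this, a compat break only becomes "hard" when an old decoder
is removed from the registry, which can be deferred past a transitional
release.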

> How can I help move toward a KS 0.30 release?

I'll draw up a todo list.

Marvin Humphrey

