incubator-lucy-dev mailing list archives

From Marvin Humphrey
Subject Roadmap for first release
Date Sun, 18 Jul 2010 17:59:40 GMT

At the risk of counting our chickens before they hatch, the proposal to
assimilate the KinoSearch codebase and the move to the Incubator seems likely
to pass the lazy consensus vote now underway.  :)  It's time to start thinking
about what ought to be in Lucy's first release.

I propose a minimalist strategy that will allow us to get to a release as
quickly as possible:

  1. Branch Lucy off the last KS bugfix release rather than svn trunk.
  2. Perform IP clearance and relicensing.
  3. Perform a few massive find and replace operations to change the imported
     codebase to "Lucy".
  4. Add code to enable Lucy to read existing KinoSearch 0.3x indexes.
  5. Consider moving a few classes around.
  6. Write a Lucy::Docs::KinoSearch2Lucy transition guide and a
     "" tool to adapt user codebases using 0.3x

A fair amount of work has been done on KinoSearch's svn trunk since the last
bugfix release, but it won't be hard to reproduce that work, and there aren't
any IP issues that require those commits to go through IP clearance.  By going
with code that has already lived in production environments, we give ourselves
a better chance at making a good first impression via code that "just works"
for new users.  New users will surely mean new bug reports and we should plan
to make Lucy bugfix releases, but hopefully by minimizing churn we will make
it possible to focus on user support and evangelizing in the wake of the
initial release rather than bughunting.

With regard to moving a few classes around... To paraphrase Yonik Seeley,
every time you change an API, you destroy part of the community's collective
memory.  There are some migrations that were already underway, such as moving
Similarity out of Search and underneath Index.  There are other migrations we
should consider now, such as moving Stemmer outside of core, to
LucyX::Analysis::SnowballStemmer.  IMO, it would be better to complete such
moves prior to the first Lucy release, rather than destroy collective memory
afterward.

It probably makes sense to make one more KinoSearch release addressing some of
the issues for the transition.  In particular, in svn trunk, FullTextType and
StringType have been consolidated into TextType.  It would be nice if new Lucy
users never had to think about distinguishing between FullTextType and
StringType, since they're going away anyway.

I think we draw the line at moving classes around, though.  There are a number
of other issues that need to be addressed before we fork off a stable "Lucy1"
branch.  For instance, I think multi-stream posting files should be a blocker
issue for Lucy1 because of the search-time performance implications, and
there's been a fair amount of work done in svn trunk towards resolving that
issue.  However, I think we should take Hoss's advice into account when
scheduling such issues:

    that was the hardest thing for me to wrap my head around when Solr
    was incubating -- in many ways i was actively trying to keep Solr a
    "secret" until i felt like it was "ready to be unveiled" but that's not
    what incubation is about, and it's really the antithesis of how to have
    a successful project -- you don't get a lot of contributors all at once by
    saying "here it is, we've got something that's stable and solid and
    'done', who wants to come be a part of it?" .. you get contributors slowly
    and surely by saying "here's what we've got so far, who wants to help us
    make this better?"

I think if we market the initial Lucy release as "help us get to a stable
release", then A) people will be more forgiving of our "work in progress", and
B) we may attract more contributors.  We can put off the more disruptive work
until later.

Sound like a plan?

Marvin Humphrey
