lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Wang <>
Subject Re: lucene cutomized indexing
Date Tue, 20 Jul 2004 16:12:21 GMT
Hi Daniel:

     There are few things I want to do to be able to customize lucene:

1) to be able to plug in a different similarity model (e.g. bayesian,
vector space etc.)

2) to be able to store certain fields in its own format and provide
corresponding readers. I may not want to store every field in the
lexicon/inverted index structure. I may have fields that doesn't make
sense to store the position or frequency information.

3) to be able to customize analyzers to add more information to the
Token while doing tokenization.

Oleg mentioned about the HayStack project. In the HayStack source
code, they had to modifiy many lucene class to make them non-final in
order to customzie. They make sure during deployment their "versions"
gets loaded before the same classes in the lucene .jar. It is
cumbersome, but it is a Lucene restriction they had to live with.

I believe there are many other users feel the same way. 

If I write some classes that derives from the lucene API and it
breaks, then it is my responsibility to fix it. I don't understand why
it would add burden to the Lucene developers.



On Tue, 20 Jul 2004 17:56:26 +0200, Daniel Naber
<> wrote:
> On Tuesday 20 July 2004 17:28, John Wang wrote:
> >    I have asked to make the Lucene API less restrictive many many many
> > times but got no replies.
> I suggest you just change it in your source and see if it works. Then you can
> still explain what exactly you did and why it's useful. From the developers
> point-of-view having things non-final means more stuff is exposed and making
> changes is more difficult (unless one accepts that derived classes may break
> with the next update).
> Regards
> Daniel
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message