lucene-dev mailing list archives

From "John Wang" <john.w...@gmail.com>
Subject Re: docid set compression and boolean docid set operations
Date Wed, 17 Sep 2008 06:12:17 GMT
Paul and Eks:

        Because this is being hosted on code.google.com, it requires proj.
members to have gmail accts. Can you guys send me yours?

        Also, we are developing on the BR_DEV_1_0_4 branch
<http://code.google.com/p/lucene-ext/source/browse/#>.

Thanks

-John

On Mon, Sep 15, 2008 at 1:24 PM, eks dev <eksdev@yahoo.co.uk> wrote:

> >    Are you guys interested in helping out on kamikaze?
>
> sure, as much as my schedule permits (slow but steady :).
>
> p4delta should be the right way to go. My proposal would be to try to make
> it as fast as possible (Paul's comment about the if in the decompression
> loop...) for the simplest case, a set of integers (longs?), and make it
> available as one of the Filter implementations (fast-track commit into
> Lucene and bigger exposure to others who can make it better).
> The code in kamikaze is probably almost there (I have not looked into it
> yet); we just need to open an issue in JIRA and provide a simple patch for
> basic p4delta functionality and one Filter implementation with a few test
> cases. That way it becomes useful for the community from the very start
> (and it could also be mixed with Paul's code on DocIdSetIterators...).
>
> This will not bring a huge benefit, but it is definitely a useful option for
> Filters. The real work then is to use it for the on-disk format and
> partially replace VInt encoding in the more involved cases I mentioned
> before, like (docId delta, term frequency) pairs in Lucene with multilevel
> skipping information.
>
> As Paul said, small steps :)
>
> cheers,
> eks
>
>
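
For context on the baseline the quoted message refers to, here is a minimal Java sketch of gap (delta) encoding of sorted doc IDs using variable-byte integers, which is the general idea behind VInt-coded docId deltas. It is not p4delta, and it is not taken from kamikaze or Lucene; the class name DeltaVIntSketch and its methods are hypothetical, and the sketch assumes strictly increasing, non-negative doc IDs.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical illustration only: gap encoding of sorted doc IDs with
// variable-byte integers (7 payload bits per byte, high bit = "more follows").
public class DeltaVIntSketch {

    // Assumes sortedDocIds is strictly increasing and non-negative.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            int gap = docId - previous;        // store the difference, not the ID
            while ((gap & ~0x7F) != 0) {       // more than 7 bits remain
                out.write((gap & 0x7F) | 0x80);
                gap >>>= 7;
            }
            out.write(gap);                    // last byte has the high bit clear
            previous = docId;
        }
        return out.toByteArray();
    }

    // Reverses encode(); count is the number of doc IDs that were encoded.
    static int[] decode(byte[] bytes, int count) {
        int[] docIds = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int gap = 0;
            int shift = 0;
            byte b;
            do {
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);         // per-byte branch in the decode loop
            previous += gap;                   // running sum restores the doc ID
            docIds[i] = previous;
        }
        return docIds;
    }

    public static void main(String[] args) {
        int[] docIds = {3, 7, 130, 131, 20000};
        byte[] packed = encode(docIds);
        int[] restored = decode(packed, docIds.length);
        System.out.println(packed.length + " bytes, round trip ok: "
                + Arrays.equals(docIds, restored));
    }
}
```

The per-byte branch in the decode loop is the kind of cost that block-based schemes such as p4delta try to avoid, by packing a block of gaps at a fixed bit width and handling the values that do not fit as exceptions.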
