lucene-dev mailing list archives

From "John Wang" <>
Subject Re: docid set compression and boolean docid set operations
Date Wed, 17 Sep 2008 06:12:17 GMT
Paul and Eks:

        Because this is being hosted on, it requires project
members to have Gmail accounts. Can you guys send me yours?

        Also, we are developing on the



On Mon, Sep 15, 2008 at 1:24 PM, eks dev <> wrote:

> >    Are you guys interested in helping out on kamikaze?
> Sure, as much as my schedule permits (slow but steady :).
> p4delta should be the right way to go. My proposal would be to make it
> as fast as possible (see Paul's comment about the "if" in the decompression
> loop...) for the simplest case, a set of integers (longs?), and make it
> available as one of the Filter implementations (a fast-track commit into
> Lucene and more exposure to others who can make it better).
> The code in kamikaze is probably almost there (I have not looked into it
> yet); we just need to open an issue in JIRA and provide a simple patch with
> basic p4delta functionality and one Filter implementation with a few test
> cases. That way it becomes useful to the community from the very start (with
> the possibility of mixing it with Paul's code on DocIdSetIterators...).
> This will not bring a huge benefit, but it is definitely a useful option for
> Filters. The real work then is to use it for the on-disk format and
> partially replace VInt encoding in the more involved cases I mentioned
> before, like (docId delta, term frequency) pairs in Lucene with multilevel
> skipping information.
> As Paul said, small steps :)
> cheers,
> eks
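
For readers unfamiliar with the baseline mentioned above, here is a minimal
sketch of delta (gap) encoding of a sorted doc ID list with variable-length
integers, the general scheme the thread proposes to partially replace. It is
not kamikaze's p4delta code and not Lucene's actual implementation; the class
and method names (DeltaVIntSketch, writeVInt, encode, decode) are hypothetical
and chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Illustrative only: hypothetical names, not taken from Lucene or kamikaze.
public class DeltaVIntSketch {

    // Write a non-negative int using 7 payload bits per byte; the high bit
    // is set on every byte except the last one of the value.
    static void writeVInt(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encode a strictly increasing list of doc IDs as VInt-coded gaps.
    static byte[] encode(int[] sortedDocIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int previous = 0;
        for (int docId : sortedDocIds) {
            writeVInt(out, docId - previous);
            previous = docId;
        }
        return out.toByteArray();
    }

    // Decode `count` doc IDs by reading each gap and adding it to a running sum.
    static int[] decode(byte[] bytes, int count) {
        int[] docIds = new int[count];
        int pos = 0;
        int previous = 0;
        for (int i = 0; i < count; i++) {
            int gap = 0;
            int shift = 0;
            byte b;
            do { // one data-dependent branch per input byte
                b = bytes[pos++];
                gap |= (b & 0x7F) << shift;
                shift += 7;
            } while ((b & 0x80) != 0);
            previous += gap;
            docIds[i] = previous;
        }
        return docIds;
    }

    public static void main(String[] args) {
        int[] docIds = {3, 7, 8, 200, 1000, 100000};
        byte[] encoded = encode(docIds);
        int[] decoded = decode(encoded, docIds.length);
        System.out.println(encoded.length + " bytes, round trip ok: "
                + Arrays.equals(docIds, decoded));
    }
}
```

The per-byte branch in decode is the kind of check inside the decompression
loop that block-oriented codecs in the PForDelta family try to avoid by
packing a whole block of gaps at a fixed bit width and handling outliers
separately.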
