Date: Mon, 22 Sep 2008 00:35:53 -0600
From: "Nathan Kurz"
To: lucy-dev@lucene.apache.org
Subject: Re: Posting codecs

On Fri, Sep 19, 2008 at 8:51 PM, Marvin Humphrey wrote:
> Let's set a goal of implementing PForDelta, and figure out how to get
> there from here.

Sure, that seems like a fine goal. I'm not sure if you meant it this way,
but I think it would be great to add PForDelta alongside the existing
VByte support rather than simply replacing it. While PForDelta might fully
replace VByte eventually, it would be good to design an architecture that
allows for multiple file formats, so that tricks like using Lucene index
files directly are theoretically possible, and so that the path for
creating custom file formats is better tested.

>> You need to figure out some way to decode the entire posting with fewer
>> function calls, and this is where the bulk of them are coming from.
>> I'd suggest having the Store level return an undecoded raw posting,
>> and let the Posting class decode the parts it wants. That way the
>> VByte code can be macros that work on a local buffer in a tight loop.
>
> It's a little tricky to give anything other than the InStream object
> itself direct access to the raw buffer. It's inevitable that the bytes
> that form a compressed integer will straddle a buffer boundary from time
> to time.

OK. I wasn't hoping to give access to the raw buffer, but to a cooked one
guaranteed to contain the entire posting. But I'd forgotten that the size
of the compressed posting isn't known in advance! The important part,
though, is that the entire posting be decompressed in a tight loop with
minimal function call overhead. So instead of my first suggestion, can we
have the Store return a fully decoded Posting?
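To make "tight loop" concrete, here's the flavor of decoding I have in
mind -- purely an illustrative sketch, with invented names rather than
anything from the actual KinoSearch source, and assuming the buffer
already holds every byte of the posting:

/* Sketch only: invented names, and the common VByte convention where the
 * high bit of each byte means "more bytes follow" and the low 7 bits
 * carry the value, least-significant group first. */

#include <stddef.h>
#include <stdint.h>

/* Decode one VByte-encoded integer from `buf`, advancing the pointer. */
#define DECODE_VBYTE(buf, result) \
    do { \
        uint32_t shift_ = 0; \
        (result) = 0; \
        for (;;) { \
            uint8_t byte_ = *(buf)++; \
            (result) |= (uint32_t)(byte_ & 0x7F) << shift_; \
            if (!(byte_ & 0x80)) break; \
            shift_ += 7; \
        } \
    } while (0)

/* Turn a run of compressed doc-id deltas into absolute doc numbers in one
 * tight loop: no per-integer function call, no per-byte bounds check. */
static void
decode_doc_deltas(const uint8_t *buf, uint32_t *docs, size_t num_docs)
{
    uint32_t doc = 0;
    for (size_t i = 0; i < num_docs; i++) {
        uint32_t delta;
        DECODE_VBYTE(buf, delta);
        doc += delta;
        docs[i] = doc;
    }
}

The point is just that once the bytes are sitting in a local buffer, the
per-integer cost is a few shifts and ORs rather than a method call.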
More fully fleshed out (if still only partially informed), I'd like a
system that looks something like this:

1) Scorer->init creates a template Posting which is passed to the Store
for approval. The Store initializes itself and answers affirmatively if it
can handle the particular flavor of Posting proposed. For example,
Store->init_posting(PositionPosting) might be refused by a Store that only
handles MatchPostings due to index limitations. Alternatively, a Store
managing an index that has position information available might re-wire
itself to use a faster method to fill a MatchPosting than a full
PositionPosting.

2) The scoring loop starts and the Scorer calls
Store->fill_posting(PositionPosting, min_doc_number). This is analogous to
Scorer->SkipTo(), but decodes the entire posting from the index in a
single pass. The Posting has nominal control over its own fields (it
resizes its own position buffer, for example), but the Store is allowed to
write data directly to the struct and the Scorer reads fields directly
from it.

My goals here are to have a system that dramatically speeds up the reads
from the index, makes it possible for existing Scorers to work with new
index formats, and allows for easy creation of custom Scorers for use with
custom indexes. My guess is that a move to this system would give an
immediate 2x speedup, while making it easier to add custom features and
future optimizations. I think the main downside is that it violates most
people's conception of a standard Object Oriented Model. Personally, my
main doubts are about the initialization/negotiation phase --- perhaps
there is some better way to do this?

And I apologize for my ad hoc terminology. It's partly that I'm not
familiar with the proper KinoSearch terminology, and partly that the
levels that are there don't quite line up with my mental model. What I
want is for the Posting to be an intermediary between the Scorer and the
Store, where the Posting has a known data layout and only the Store knows
about the index internals, be it VByte or PForDelta. I'd be happy to call
the Store an Index, apart from the confusion between the file and the
class, but I'm not comfortable viewing it as a Stream. Is there a more
standard way to express this?
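To put that in more concrete terms, here's a rough sketch of the interface
I'm imagining -- again, all of the names and the struct layout are
invented for illustration, not taken from KinoSearch:

/* Illustrative sketch only -- invented names and layout. */

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A Posting with a known, fixed layout.  The Store writes into it
 * directly and the Scorer reads from it directly; only the buffer
 * behind `positions` is managed by the Posting itself. */
typedef struct PositionPosting {
    uint32_t  doc_num;        /* absolute document number                */
    uint32_t  freq;           /* number of valid entries in positions    */
    uint32_t *positions;      /* buffer owned and resized by the Posting */
    size_t    positions_cap;  /* current capacity of that buffer         */
} PositionPosting;

/* Opaque: only the Store knows whether the index is VByte, PForDelta,
 * a Lucene segment, or something custom. */
typedef struct Store Store;

/* Negotiation phase: the Scorer proposes a Posting flavor and the Store
 * answers whether it can fill it.  A Store might refuse (e.g. no
 * positions in the index) or rewire itself to a faster fill routine. */
bool Store_init_posting(Store *store, PositionPosting *proposed);

/* Analogous to SkipTo(): advance to the first doc >= min_doc_num and
 * decode the entire posting in a single pass, writing straight into the
 * struct.  Returns false when the posting list is exhausted. */
bool Store_fill_posting(Store *store, PositionPosting *posting,
                        uint32_t min_doc_num);

The Scorer's inner loop would then be little more than
while (Store_fill_posting(store, &posting, target)) { ... }, reading
whatever fields it cares about without knowing whether the bytes on disk
were VByte or PForDelta.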
>> The last thing (sort of a non-thing) is that to my surprise the
>> double/triple buffering of the Posting data doesn't seem to have much
>> of a negative effect. I still think it's worth trying to avoid this,
>> but it's barely on the current radar.
>
> The double/triple buffering may not cause a slowdown, but it's worth
> going with mmap for the backing buffer on InStream simply for memory
> footprint. Since all index files are read-only once committed, they can
> be shared. That means if you have multiple processes with multiple
> IndexReaders all reading the same file, they can all effectively use the
> same backing buffer, and we'll let the OS manage page swapping.

I agree that there are multiple other reasons for using mmap here. I
definitely want a design that makes it easy to drop in mmap versions of
the index readers, for all the reasons that you specify.

I'm also hoping that we can make the rest of the search process efficient
enough that the savings from moving to single buffering become
worthwhile. But we have a ways to go until that is the case: maybe an 8x
speedup from the recent baseline before it starts to be a limiting factor?
I'm optimistic that this level of speedup is possible, though.

Nathan Kurz
nate@verse.com