lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <>
Subject [jira] Updated: (LUCENE-1410) PFOR implementation
Date Fri, 03 Oct 2008 11:26:44 GMT


Michael McCandless updated LUCENE-1410:


Paul, I'm eager to test pfor on real Lucene vInts.

So I created a simple test program (attached).  Run it like this:

  Usage: java org.apache.lucene.util.pfor.TestPFor2 <indexDirName> <vIntFileNameIn> <pForFileNameOut>
  Eg: java org.apache.lucene.util.pfor.TestPFor2 /lucene/index _l.prx _l.prx.pfor

where indexDirName is the directory of a Lucene index, vIntFileNameIn
is a file containing nothing but vInts (Lucene's *.frq and *.prx files
qualify), and pForFileNameOut is the file it writes with blocks
encoded in PFor.

It first encodes the vInts from vIntFileNameIn into pfor blocks
written to pForFileNameOut.  Then it measures decode time of reading
all vInts from vIntFileNameIn vs decode time of reading all pfor
blocks.  It runs each round 5 times.
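For context, the readVInt baseline being measured is Lucene's plain
variable-byte coding: 7 payload bits per byte, low-order groups first,
with the high bit marking a continuation byte.  A minimal sketch (class
and method names here are illustrative, not Lucene's actual
IndexInput/IndexOutput API):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Sketch of vInt (variable-byte) coding.  Names are illustrative only.
public class VIntCodec {
  public static void writeVInt(ByteArrayOutputStream out, int value) {
    while ((value & ~0x7F) != 0) {      // more than 7 bits remain
      out.write((value & 0x7F) | 0x80); // low 7 bits, continuation bit set
      value >>>= 7;
    }
    out.write(value);                   // final byte, high bit clear
  }

  public static int readVInt(ByteArrayInputStream in) {
    int b = in.read();
    int value = b & 0x7F;
    for (int shift = 7; (b & 0x80) != 0; shift += 7) {
      b = in.read();
      value |= (b & 0x7F) << shift;
    }
    return value;
  }
}
```

The byte-at-a-time, branchy decode loop is exactly what a block codec
like pfor gets to avoid.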

The test has certain serious flaws:

  * Can you add a method that picks the right frame size to use for a
    given block of ints (so that ~90% of the ints fit in < N bits)?

  * I'm using a fixed 6-bit frame size.  Can you add larger bit sizes
    to your pfor decompress?

With these fixes the test would be more fair to pfor.
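The first request could be sketched roughly like this, assuming a
sort-based percentile heuristic (hypothetical helper, not part of
Paul's patch): size the frame to cover the value at the ~90th
percentile, and let the remaining ~10% become patched exceptions.

```java
import java.util.Arrays;

// Hypothetical frame-size heuristic sketch, not PFor's actual code.
public class FrameSizer {
  public static int chooseNumFrameBits(int[] block) {
    int[] sorted = block.clone();
    Arrays.sort(sorted);
    // value at the ~90th percentile; everything at or below it fits the frame
    int pivot = sorted[(int) (0.9 * (sorted.length - 1))];
    return Math.max(1, 32 - Integer.numberOfLeadingZeros(pivot));
  }
}
```

A real implementation would likely use a histogram of bit widths
instead of a full sort, but the idea is the same.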

In the PFor file that I write, I simply write an int (# bytes)
followed by the bytes, for each block.  Then when reading these blocks
I read the # bytes, then read into the byte array backing the intArray
passed to PFor for decompress.  In a real integration I think writing
the int # bytes should be unnecessary (is pfor self-punctuating?).
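That framing, a 4-byte length header followed by the compressed
payload, can be sketched as follows (hypothetical helper names;
ByteBuffer stands in here for the real index file I/O):

```java
import java.nio.ByteBuffer;

// Per-block framing sketch: int #bytes header, then the payload.
// A real integration could drop the header if the codec marks its own end.
public class BlockIO {
  public static void writeBlock(ByteBuffer out, byte[] compressed) {
    out.putInt(compressed.length); // int #bytes header
    out.put(compressed);           // payload
  }

  public static byte[] readBlock(ByteBuffer in) {
    int numBytes = in.getInt();
    byte[] buf = new byte[numBytes];
    in.get(buf);
    return buf;
  }
}
```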

This is also inefficient: in doing this for real we should avoid the
double-copy of the byte[].  In fact, we might push it even lower
(under IndexInput, eg create an IntBlockIndexInput) to possibly avoid
the copy into byte[] in BufferedIndexInput, perhaps by using direct
ByteBuffers from the OS.  So even once we fix the two issues above,
the results of a "real" integration should be faster still.
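As an illustration of avoiding that copy: a direct ByteBuffer (filled
by FileChannel.read, or a mapped region) can be viewed as an IntBuffer
that shares its storage, so ints are consumed without first copying
into a byte[].  This is only a sketch assuming fixed-width ints; the
IntBlockIndexInput mentioned above is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.IntBuffer;

// No-copy read sketch: view a direct buffer as ints, no byte[] staging.
public class DirectView {
  public static int[] readInts(ByteBuffer direct, int count) {
    IntBuffer view = direct.asIntBuffer(); // a view, not a copy
    int[] dst = new int[count];
    view.get(dst);
    return dst;
  }
}
```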

I ran this on a 622 MB prx file from a Wikipedia index, and even with
the two limitations above it's still a good amount faster:

java org.apache.lucene.util.pfor.TestPFor2 /lucene/big _l.prx _l.prx.pfor

encode _l.prx to _l.prx.pfor...
442979072 vints; 888027375 bytes compressed vs orig size 651670377

decompress using pfor:
4198 msec to decode 442979072 vInts (105521 vInts/msec)
4332 msec to decode 442979072 vInts (102257 vInts/msec)
4165 msec to decode 442979072 vInts (106357 vInts/msec)
4122 msec to decode 442979072 vInts (107467 vInts/msec)
4061 msec to decode 442979072 vInts (109081 vInts/msec)

decompress using readVInt:
7315 msec to read 442979104 vInts (60557 vInts/msec)
7390 msec to read 442979104 vInts (59943 vInts/msec)
5816 msec to read 442979104 vInts (76165 vInts/msec)
5937 msec to read 442979104 vInts (74613 vInts/msec)
5970 msec to read 442979104 vInts (74200 vInts/msec)

It's really weird how the time suddenly gets faster during readVInt.
It's very repeatable.  On another machine I see it suddenly get slower
starting at the same (3rd) round.  Adding -server and -Xbatch doesn't
change this behavior.  This is with JDK (build 1.6.0_10-rc-b28) on
Linux and JDK (build 1.6.0_05-b13-120) on Mac OS X 10.5.5.

> PFOR implementation
> -------------------
>                 Key: LUCENE-1410
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: LUCENE-1410b.patch,
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
> Implementation of Patched Frame of Reference.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
