lucene-dev mailing list archives

From "hao yan (JIRA)" <>
Subject [jira] Created: (LUCENE-2903) Improvement of PForDelta Codec
Date Tue, 01 Feb 2011 22:29:29 GMT
Improvement of PForDelta Codec

                 Key: LUCENE-2903
             Project: Lucene - Java
          Issue Type: Improvement
            Reporter: hao yan

There are 3 versions of PForDelta implementations in the Bulk Branch: FrameOfRef, PatchedFrameOfRef,
and PatchedFrameOfRef2.

FrameOfRef is the most basic one: it is essentially a plain binary encoding, and may result
in a very large index.
The PatchedFrameOfRef is the implementation based on the original version of PForDelta in the
The PatchedFrameOfRef2 is my previous implementation, which is improved this time. (The codec
name is changed to NewPForDelta.)
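
To illustrate why a plain frame-of-reference encoding can inflate the index while a patched
variant does not, here is a minimal sketch. It is not code from the bulk branch: the class
PForSketch, its methods, and the exception cost model (32 bits per value plus 8 bits per
position) are all hypothetical, chosen only to make the size difference visible.

```java
import java.util.Arrays;

// Hypothetical sketch, not the bulk-branch code: compares the bit cost of a
// plain frame of reference with a patched one for a single block of values.
final class PForSketch {

    // Bits needed to store a non-negative value (at least 1).
    static int bitsNeeded(int v) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(v));
    }

    // Plain frame of reference: every slot is as wide as the largest value.
    static int frameOfRefBits(int[] block) {
        int width = 1;
        for (int v : block) {
            width = Math.max(width, bitsNeeded(v));
        }
        return width * block.length;
    }

    // Patched variant: choose a width that covers `fraction` of the values and
    // store the rest as exceptions. The per-exception cost (32-bit value plus
    // 8-bit position) is an assumption made for this sketch.
    static int patchedBits(int[] block, double fraction) {
        int[] sorted = block.clone();
        Arrays.sort(sorted);
        int idx = Math.max(0, (int) Math.ceil(fraction * sorted.length) - 1);
        int width = bitsNeeded(sorted[idx]);
        int exceptions = 0;
        for (int v : block) {
            if (bitsNeeded(v) > width) {
                exceptions++;
            }
        }
        return width * block.length + exceptions * (32 + 8);
    }

    public static void main(String[] args) {
        int[] block = new int[128];
        Arrays.fill(block, 3);     // small deltas
        block[40] = 1_000_000;     // a single outlier
        System.out.println("frame of reference bits: " + frameOfRefBits(block));
        System.out.println("patched bits:            " + patchedBits(block, 0.9));
    }
}
```

With one outlier in a block of small deltas, the plain encoding widens every slot to fit the
outlier, while the patched encoding keeps the slots narrow and pays only for the exception.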

In particular, the changes are:
1. I fixed a bug in my previous version (in Lucene-1410.patch): the old PForDelta did not
support very large exceptions, since Simple16 does not support very large numbers. This has
been fixed in the new LCPForDelta.

2. I changed the PForDeltaFixedIntBlockCodec. It is now faster than the other two PForDelta
implementations in the bulk branch (FrameOfRef and PatchedFrameOfRef). The codec's name is
"NewPForDelta", as you can see in the CodecProvider and PForDeltaFixedIntBlockCodec.

3. The performance test results are:
1) My "NewPForDelta" codec is faster than FrameOfRef and PatchedFrameOfRef for almost all
kinds of queries, and slightly worse than BulkVInt.
2) My "NewPForDelta" codec produces the smallest index size among all 4 methods (FrameOfRef,
PatchedFrameOfRef, BulkVInt, and itself).
3) All performance test results were obtained by running with "-server" instead of "-client".
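
On the Simple16 limit mentioned in change 1: Simple16 packs values into 32-bit words with a
4-bit selector and 28 data bits, so a single value must fit in 28 bits. The sketch below only
illustrates that limit and one possible way around it (splitting a large value into two
parts). It is an assumption for illustration, not a description of how the patch fixes the
bug; the class LargeExceptionSketch and its methods are hypothetical, and a real format would
also need a way to mark which values were split.

```java
// Hypothetical sketch, not taken from the patch: shows the 28-bit limit of
// Simple16 and one possible way to carry a larger non-negative int.
final class LargeExceptionSketch {

    static final int SIMPLE16_DATA_BITS = 28;
    static final int LOW_MASK = (1 << SIMPLE16_DATA_BITS) - 1;

    // True if the value fits in the 28 data bits of one Simple16 word.
    static boolean fitsSimple16(int v) {
        return v >= 0 && v <= LOW_MASK;
    }

    // Split a non-negative int into parts that each fit in 28 bits:
    // the low 28 bits first, then the remaining high bits if there are any.
    static int[] split(int v) {
        if (fitsSimple16(v)) {
            return new int[] { v };
        }
        return new int[] { v & LOW_MASK, v >>> SIMPLE16_DATA_BITS };
    }

    // Inverse of split.
    static int join(int[] parts) {
        return parts.length == 1
                ? parts[0]
                : (parts[0] | (parts[1] << SIMPLE16_DATA_BITS));
    }

    public static void main(String[] args) {
        int large = Integer.MAX_VALUE;
        int[] parts = split(large);
        System.out.println("fits in one word: " + fitsSimple16(large));
        System.out.println("parts: " + parts.length);
        System.out.println("round trip ok: " + (join(parts) == large));
    }
}
```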

This message is automatically generated by JIRA.