lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Johan Stuyts (JIRA)" <>
Subject [jira] Commented: (LUCENE-639) [PATCH] Slight performance improvement for readVInt() of IndexInput
Date Tue, 01 Aug 2006 07:06:15 GMT
    [ ] 
Johan Stuyts commented on LUCENE-639:

Yonik, I think it is possible to make a reasonable assumption about the distribution of the
vints. Lucene stores deltas between document IDs instead of document IDs, and I guess (no
data available) most frequencies will be below 128.

In the example given below I take the bit for the frequency that is stored with the document
IDs delta into account.

Suppose you have 50 million documents. With a term occurring in 1 in 50 documents 1 million
1-byte vints are needed. With a term occurring 1 in 2000 documents 25.000 2-byte vints are
needed. With an occurrence of 1 in 100.000 documents 500 3-byte vints are needed. Together
with my guess about the frequencies many more 1-byte and 2-byte vints than 3-byte and longer
vints have to be read. So the algorithm for readVInt() should at least perform very good for
1-byte and 2-byte vints.

To detemine whether or not it is worth improving the algorighm, I wrote an isolated test:
no I/O, as little GC as possible, etc. See the attached Java file.

I tested four algorithms:
* the current one;
* the unrolled one of this issue;
* a new one I came up with that uses exclusive or (see below);
* the one from Yonik.

The exclusive or algorithm started with the same assumption as Yonik abou the first byte:
you can exit after only a read when no more bytes follow. The I tried to think of a way to
reduce the number of bitwise operations. This line of thinking was based on the code of cardinality()
of java.util.BitSet, which uses special bit patterns to count the bits in a long as fast as
possible. This is the algorithm I ended up with:
    int i = readByte();
    if (i < 0)
      // Second byte present
      i ^= readByte() << 7;
      if (i >= 0)
        // Third byte present
        i ^= readByte() << 14;
        if (i < 0)
          // Fourth byte present
          i ^= readByte() << 21;
          if (i >= 0)
            // Fifth byte present
            i ^= readByte() << 28;
            return i ^ 0x0FE03F80;
          // No fifth byte
          return i ^ 0xFFE03F80;
        // No fourth byte
        return i ^ 0x00003F80;
      // No third byte
      return i ^ 0xFFFFFF80;
    return i;

In the tests all values returned by algorithms are verified to ensure an optimization does
not cause invaliid values to be returned.

The statistics of the test are:
* 6 sets of data with 38,4 million integers per set: random size, 1-byte, 2-byte, 3-byte,
4-byte and 5-byte;
* 1 warming up and 3 individually measured loops per data set;
* per loop each algorithm reads the data set 5 times consecutively.

In total each algorithm reads 38,4 million integers * 6 data sets * 4 loops * 5 reads = over
4 billion integers per test run. Over 18 billion integers are read in the total test.

The tests are run in a single JVM for all algorithms, data sets, loops, etc.

I ran the tests on 7 JVMs. Some JVMs have a client and a server variant, resulting in 12 JVM
* BEA JDK 1.4.2 client;
* BEA JDK 1.4.2 server;
* BEA JDK 1.5.0 client;
* BEA JDK 1.5.0 server;
* IBM JRE 1.4.2;
* IBM JDK 1.5.0;
* Sun JDK 1.4.2 client;
* Sun JDK 1.4.2 server;
* Sun JDK 1.5.0 client;
* Sun JDK 1.5.0 server;
* Sun JDK 1.6.0b2 client;
* Sun JDK 1.6.0b2 server.

I disconnected all networks on the laptop I ran the tests on. As litle services as possible
were running and the firewall was also shutdown. The only applications that were running were
Eclipse and the test. The screen saver and energy saving were disabled.

Here is the configuration used to run the tests:
* Intel Pentium M 1.73 GHz;
* 1 GB RAM;
* Windows XP Pro SP 2;
* Lucene trunk;
* Eclipse 3.1.2 Java compiler.

Please note that during the running of the tests some disk I/O was observed. Even though I
believe that no swapping would be needed, swapping might have occurred.

After running all the tests I:
* removed the slowest result of each algorithm for each JVM and dataset combination;
* averaged the two remaining results;
* divided all results by the time of the current algorithm;
* selected the 'best' algorithms for each JVM.

The best algorithm is defined as: ((1-byte time - best 1-byte time) + (2-byte time - best
2-byte time)) < 0,02. This means that the total difference between the best 1-byte and
2-byte times cannot be greater than 2 % to be considered the best.

Here are the results, best algorithm first. See the attached PDF for more information:
* exclusive or;
* combine on exit (Yonik's one);
* unrolled;
* current.

Across all JVMs the exclusive or algorithm saves 3,5-4,8 %.

Yonik already mentioned this at the URL above, a lot depends on the JVM used to run the tests.
You should do tests with the JVM on which you are going to deploy. A colleague of mine also
mentioned that the compiler used could influence the results. Unfortunately I do not have
the time to test all combinations of compilers and JVMs.

> [PATCH] Slight performance improvement for readVInt() of IndexInput
> -------------------------------------------------------------------
>                 Key: LUCENE-639
>                 URL:
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Index
>    Affects Versions: 2.0.0
>            Reporter: Johan Stuyts
>            Priority: Minor
>         Attachments: Lucene2ReadVIntPerformance.patch
> By unrolling the loop in readVInt() I was able to get a slight, about 1.8 %, performance
improvement for this method. The test program invoked the method over 17 million times on
each run.
> I ran the performance tests on:
> - Windows XP Pro SP2
> - Sun JDK 1.5.0_07
> - YourKit 5.5.4
> - Lucene trunk

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators:
For more information on JIRA, see:


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message