lucene-dev mailing list archives

From "Paul Elschot (JIRA)" <>
Subject [jira] Commented: (LUCENE-1410) PFOR implementation
Date Fri, 02 Apr 2010 19:08:27 GMT


Paul Elschot commented on LUCENE-1410:

I think the mixed performance results for decompression and query times may be caused by the
use of only a single method. For very short sequences (up to 2 or 3 integers), I would expect
VInt (actually VByte) to perform best. For long sequences (from about 25 integers), (P)FOR
should do best. In between the two, (a variant of) S9 should do best.
The problem will be to find the optimal boundary sequence sizes at which to change the
compression method.
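
The length-based choice described above could be sketched roughly as follows. This is only an illustration: the class MethodSelector, the enum Method, and both threshold constants are hypothetical names, and the cut-over values are the rough figures from this comment, not measured optima.

```java
// Hypothetical sketch: choose a compression method from the sequence length.
// The thresholds are placeholders taken from the rough figures in the comment
// ("2 or 3 integers", "about 25 integers"); real values would need benchmarking.
final class MethodSelector {
    enum Method { VBYTE, S9, PFOR }

    // Assumed upper bound for "very short" sequences.
    static final int MAX_VBYTE_LENGTH = 3;
    // Assumed lower bound for "long" sequences.
    static final int MIN_PFOR_LENGTH = 25;

    static Method choose(int sequenceLength) {
        if (sequenceLength <= 0) {
            throw new IllegalArgumentException("sequenceLength must be positive");
        }
        if (sequenceLength <= MAX_VBYTE_LENGTH) {
            return Method.VBYTE;   // very short: byte-aligned VByte
        }
        if (sequenceLength >= MIN_PFOR_LENGTH) {
            return Method.PFOR;    // long: (P)FOR
        }
        return Method.S9;          // in between: (a variant of) S9
    }

    public static void main(String[] args) {
        for (int n : new int[] {1, 3, 4, 24, 25, 1000}) {
            System.out.println(n + " -> " + choose(n));
        }
    }
}
```

Keeping the two thresholds as named constants makes it easy to vary them while searching for the best cut-over points.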

The fact that S9 is already doing better than VInt is encouraging. Since (P)FOR can do even
better than S9, I'd expect a real performance boost for queries using frequently occurring
terms in the index when (P)FOR is used only for the longer sequences.

Also, I'd recommend verifying query results for each method. S9 as I implemented it is only
tested by its own test cases.
When the query results are incorrect, measuring performance is not really useful, and this
has already happened for the PFOR implementation here; see the comments above from early
October 2008.
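
A cheap first step before comparing full query results might be a round-trip check per method: encode a sequence, decode it, and require the output to equal the input. The sketch below does this for a VByte-style encoding only (7 data bits per byte, high bit set when more bytes follow). The class VByteRoundTrip and its methods are hypothetical and not part of the attached patches; S9 and (P)FOR would need the same kind of check, and passing it does not replace verifying actual query results.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical sketch: round-trip check for a VByte-style encoding.
final class VByteRoundTrip {
    // Encode each int as 1 to 5 bytes, low-order 7 bits first;
    // the high bit of a byte signals that another byte follows.
    static byte[] encode(int[] values) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int v : values) {
            while ((v & ~0x7F) != 0) {
                out.write((v & 0x7F) | 0x80);
                v >>>= 7;
            }
            out.write(v);
        }
        return out.toByteArray();
    }

    // Decode exactly 'count' integers from the byte array.
    static int[] decode(byte[] bytes, int count) {
        int[] values = new int[count];
        int pos = 0;
        for (int i = 0; i < count; i++) {
            int b = bytes[pos++];
            int v = b & 0x7F;
            for (int shift = 7; (b & 0x80) != 0; shift += 7) {
                b = bytes[pos++];
                v |= (b & 0x7F) << shift;
            }
            values[i] = v;
        }
        return values;
    }

    // True when decoding the encoded form reproduces the input exactly.
    static boolean roundTrips(int[] values) {
        return Arrays.equals(values, decode(encode(values), values.length));
    }

    public static void main(String[] args) {
        int[] sample = {0, 1, 127, 128, 16383, 16384, Integer.MAX_VALUE};
        System.out.println("round trip ok: " + roundTrips(sample));
    }
}
```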

> PFOR implementation
> -------------------
>                 Key: LUCENE-1410
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Other
>            Reporter: Paul Elschot
>            Priority: Minor
>         Attachments: autogen.tgz, for-summary.txt, LUCENE-1410-codecs.tar.bz2, LUCENE-1410b.patch,
LUCENE-1410c.patch, LUCENE-1410d.patch, LUCENE-1410e.patch, TermQueryTests.tgz,,,
>   Original Estimate: 21840h
>  Remaining Estimate: 21840h
> Implementation of Patched Frame of Reference.

