lucene-dev mailing list archives

From "Adrien Grand (JIRA)" <>
Subject [jira] [Commented] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Date Thu, 09 Aug 2012 09:45:19 GMT


Adrien Grand commented on LUCENE-3892:

The comment you added in 1371011 on the value of {{BLOCK_SIZE}} caught my attention: I think
that {{BLOCK_SIZE}} should be at least 64 with PackedInts encoding/decoding, since these conversions
are long-aligned (I backported your two commits and added a comment about this). For example,
the {{PACKED}} 7-bit encoder cannot encode fewer than 64 values in one iteration: 64 values of
7 bits fill exactly 7 longs, and no smaller count ends on a long boundary.
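
To make the long-alignment constraint concrete, here is a minimal sketch of the arithmetic, assuming a tightly packed layout in which one iteration must start and end on a 64-bit boundary. The class and method names ({{PackedAlignment}}, {{valuesPerIteration}}, {{longsPerIteration}}) are hypothetical and are not taken from the PackedInts code:

```java
public final class PackedAlignment {
    private PackedAlignment() {}

    // Greatest common divisor (Euclid's algorithm).
    public static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Smallest number of values whose packed bits end exactly on a
    // 64-bit boundary: lcm(bitsPerValue, 64) / bitsPerValue.
    public static int valuesPerIteration(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 64) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 64]");
        }
        return 64 / gcd(bitsPerValue, 64);
    }

    // Number of longs consumed by one such iteration:
    // lcm(bitsPerValue, 64) / 64.
    public static int longsPerIteration(int bitsPerValue) {
        if (bitsPerValue < 1 || bitsPerValue > 64) {
            throw new IllegalArgumentException("bitsPerValue must be in [1, 64]");
        }
        return bitsPerValue / gcd(bitsPerValue, 64);
    }

    public static void main(String[] args) {
        for (int bpv : new int[] {7, 8, 16, 24}) {
            System.out.println(bpv + " bits/value: "
                + valuesPerIteration(bpv) + " values in "
                + longsPerIteration(bpv) + " longs");
        }
    }
}
```

Under this model, any odd bits-per-value needs 64 values per iteration, while 8, 16 and 24 bits need only 8, 4 and 8 values respectively.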

In case someone really wants to use a smaller block size (e.g. 32), I think it should still
perform pretty well if {{acceptableOverheadRatio >= ~25%}} (in that case, all bits-per-value
in the [1-24] range use either a {{PACKED_SINGLE_BLOCK}} encoder or an 8-bit, 16-bit or
24-bit {{PACKED}} decoder).
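
As a rough check of why the 8-bit, 16-bit and 24-bit {{PACKED}} cases fit a block size of 32, the sketch below lists the bit widths whose long-aligned iteration evenly divides a given block size. It models only a tightly packed, long-aligned layout; it does not model {{PACKED_SINGLE_BLOCK}} or how {{acceptableOverheadRatio}} selects a format, and the names {{BlockSizeCheck}} and {{compatibleWidths}} are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

public final class BlockSizeCheck {
    private BlockSizeCheck() {}

    // Greatest common divisor (Euclid's algorithm).
    static int gcd(int a, int b) {
        while (b != 0) {
            int t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    // Returns the bit widths in [1, maxBits] for which blockSize is a
    // whole number of long-aligned iterations, i.e. blockSize is a
    // multiple of 64 / gcd(bitsPerValue, 64).
    public static List<Integer> compatibleWidths(int blockSize, int maxBits) {
        List<Integer> ok = new ArrayList<>();
        for (int bpv = 1; bpv <= maxBits; bpv++) {
            int valuesPerIteration = 64 / gcd(bpv, 64);
            if (blockSize % valuesPerIteration == 0) {
                ok.add(bpv);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        System.out.println("block size 32: " + compatibleWidths(32, 24));
        System.out.println("block size 64: " + compatibleWidths(64, 24));
    }
}
```

With a block size of 64 every width in [1-24] qualifies under this model; with 32, only the even widths do, which includes 8, 16 and 24.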

Do we plan to make the block size configurable?
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>                 Key: LUCENE-3892
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>         Attachments: LUCENE-3892-BlockTermScorer.patch, LUCENE-3892-blockFor&hardcode(base).patch,
LUCENE-3892-blockFor&packedecoder(comp).patch, LUCENE-3892-blockFor-with-packedints-decoder.patch,
LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints.patch,
LUCENE-3892-bulkVInt.patch, LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch,
LUCENE-3892-handle_open_files.patch, LUCENE-3892-non-specialized.patch, LUCENE-3892-pfor-compress-iterate-numbits.patch,
LUCENE-3892-pfor-compress-slow-estimate.patch, LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch,
LUCENE-3892_for_unfold_method.patch, LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch,
LUCENE-3892_settings.patch, LUCENE-3892_settings.patch
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of issues opened with patches in different
> states.
> Initial results (based on prototype) were excellent (see
> ).
> I think this would make a good GSoC project.
