lucene-dev mailing list archives

From "Han Jiang (JIRA)" <>
Subject [jira] [Closed] (LUCENE-4283) Support more frequent skip with Block Postings Format
Date Tue, 21 Aug 2012 12:23:38 GMT


Han Jiang closed LUCENE-4283.

    Resolution: Later

We didn't get an overall improvement with partial decoding, and some of the patches here are more
about "avoid skipper" than "more frequent skip", so this issue is closed for now :)
> Support more frequent skip with Block Postings Format
> -----------------------------------------------------
>                 Key: LUCENE-4283
>                 URL:
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Han Jiang
>            Priority: Minor
>         Attachments: LUCENE-4283-buggy.patch, LUCENE-4283-buggy.patch, LUCENE-4283-codes-cleanup.patch,
LUCENE-4283-record-next-skip.patch, LUCENE-4283-record-skip&inlining-scanning.patch, LUCENE-4283-slow.patch,
LUCENE-4283-small-interval-fully.patch, LUCENE-4283-small-interval-partially.patch
> This change works on the new bulk branch.
> Currently, our BlockPostingsFormat only supports skipInterval==blockSize. Every time
the skipper reaches the last level-0 skip point, we have to decode a whole block to read
doc/freq data. Also, a higher-level skip list is created only for terms with df>blockSize^k,
which means that for most terms skipping is just a linear scan. If we increase the current blockSize
for better bulk I/O performance, the current skip setting will become a bottleneck.
> For ForPF, the encoded block can easily be split if we set skipInterval=32*k.
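
To make the df>blockSize^k point concrete, here is a minimal sketch, not code from any of the attached patches. It assumes a simplified model in which level k of the skip list exists only when df exceeds skipInterval^k, as the description states. The class and method names (SkipLevelSketch, numSkipLevels) are hypothetical and are not part of Lucene.

```java
// Hypothetical illustration only; this is not Lucene's skip list implementation.
final class SkipLevelSketch {

    // Counts the levels k >= 1 for which df > skipInterval^k, following the
    // simplified rule quoted in the issue description (an assumption here).
    static int numSkipLevels(int df, int skipInterval) {
        if (skipInterval < 2) {
            throw new IllegalArgumentException("skipInterval must be >= 2");
        }
        int levels = 0;
        long span = skipInterval; // long avoids overflow while multiplying
        while (df > span) {
            levels++;
            span *= skipInterval;
        }
        return levels;
    }

    public static void main(String[] args) {
        int[] docFreqs = {100, 1000, 20000};
        for (int df : docFreqs) {
            System.out.println("df=" + df
                + " levels(skipInterval=128)=" + numSkipLevels(df, 128)
                + " levels(skipInterval=32)=" + numSkipLevels(df, 32));
        }
    }
}
```

Under this model a term with df=100 gets no skip data at all when skipInterval is 128, so advancing within it is a linear scan, while skipInterval=32 gives it one level of skip points.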
