lucene-dev mailing list archives

From "Han Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LUCENE-3892) Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
Date Mon, 30 Jul 2012 17:09:37 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-3892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Han Jiang updated LUCENE-3892:
------------------------------

    Attachment: LUCENE-3892-blockFor&packedecoder(comp).patch
                LUCENE-3892-blockFor&hardcode(base).patch

Previous experiments showed a net loss with the PackedInts API; however, there were slight
differences between the two implementations, e.g. the all-values-the-same case was not handled
identically. I expect these two patches make the comparison fair enough.
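
For readers unfamiliar with the distinction being benchmarked, here is a minimal sketch of the two decoding styles: a generic decoder that computes shifts and masks per value for any bit width, and a decoder hardwired for one bit width (4 bits here) with the shifts unrolled as constants. Everything in it is an assumption made for illustration: the class and method names (`BlockDecodeSketch`, `pack`, `decodeGeneric`, `decode4`), the block size of 128, and the bit layout. It is neither the code in the attached patches nor Lucene's PackedInts API.

```java
import java.util.Arrays;

final class BlockDecodeSketch {
    static final int BLOCK_SIZE = 128; // hypothetical block size, chosen for illustration

    // Packs values (each assumed to fit in `bits` bits, 1..32) into longs, lowest bits first.
    public static long[] pack(int[] values, int bits) {
        long[] packed = new long[(values.length * bits + 63) / 64];
        for (int i = 0; i < values.length; i++) {
            long v = values[i] & 0xFFFFFFFFL;
            int bitPos = i * bits;
            int word = bitPos >>> 6;
            int shift = bitPos & 63;
            packed[word] |= v << shift;
            if (shift + bits > 64) {
                // The value straddles two longs: spill the high bits into the next one.
                packed[word + 1] |= v >>> (64 - shift);
            }
        }
        return packed;
    }

    // Generic style: one loop for every bit width; shift and mask are computed per value.
    public static void decodeGeneric(long[] packed, int bits, int[] out) {
        long mask = (1L << bits) - 1;
        for (int i = 0; i < out.length; i++) {
            int bitPos = i * bits;
            int word = bitPos >>> 6;
            int shift = bitPos & 63;
            long v = packed[word] >>> shift;
            if (shift + bits > 64) {
                v |= packed[word + 1] << (64 - shift);
            }
            out[i] = (int) (v & mask);
        }
    }

    // Hardwired style: specialized for 4 bits per value, 16 values per long,
    // with constant shifts and no per-value branching.
    // Requires out.length == 16 * packed.length.
    public static void decode4(long[] packed, int[] out) {
        int o = 0;
        for (long w : packed) {
            out[o++] = (int) (w & 15);
            out[o++] = (int) ((w >>> 4) & 15);
            out[o++] = (int) ((w >>> 8) & 15);
            out[o++] = (int) ((w >>> 12) & 15);
            out[o++] = (int) ((w >>> 16) & 15);
            out[o++] = (int) ((w >>> 20) & 15);
            out[o++] = (int) ((w >>> 24) & 15);
            out[o++] = (int) ((w >>> 28) & 15);
            out[o++] = (int) ((w >>> 32) & 15);
            out[o++] = (int) ((w >>> 36) & 15);
            out[o++] = (int) ((w >>> 40) & 15);
            out[o++] = (int) ((w >>> 44) & 15);
            out[o++] = (int) ((w >>> 48) & 15);
            out[o++] = (int) ((w >>> 52) & 15);
            out[o++] = (int) ((w >>> 56) & 15);
            out[o++] = (int) ((w >>> 60) & 15);
        }
    }

    public static void main(String[] args) {
        int[] values = new int[BLOCK_SIZE];
        for (int i = 0; i < values.length; i++) {
            values[i] = (i * 7) % 16; // every value fits in 4 bits
        }
        long[] packed = pack(values, 4);
        int[] generic = new int[BLOCK_SIZE];
        int[] hardwired = new int[BLOCK_SIZE];
        decodeGeneric(packed, 4, generic);
        decode4(packed, hardwired);
        // Both decoders should reproduce the original block.
        System.out.println(Arrays.equals(values, generic) && Arrays.equals(values, hardwired));
    }
}
```

A hardwired approach needs one such method per supported bit width, which trades code size for the removal of per-value arithmetic and branches. The sketch says nothing about which style is faster in practice.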

Base: BlockForPF + hardwired decoder
Comp: BlockForPF + PackedInts.Decoder
{noformat}
                Task    QPS base StdDev base    QPS comp StdDev comp      Pct diff
         AndHighHigh       25.66        0.31       22.61        1.21  -17% -   -6%
          AndHighMed       74.17        1.45       59.48        3.62  -26% -  -13%
              Fuzzy1       95.60        1.51       96.06        2.22   -3% -    4%
              Fuzzy2       28.67        0.50       28.51        0.75   -4% -    3%
              IntNRQ       33.31        0.60       30.73        1.51  -13% -   -1%
          OrHighHigh       17.58        0.59       16.22        1.18  -17% -    2%
           OrHighMed       34.42        0.93       32.14        2.33  -15% -    2%
            PKLookup      217.08        4.25      213.76        1.37   -4% -    1%
              Phrase        6.10        0.12        5.34        0.07  -15% -   -9%
             Prefix3       77.27        1.26       70.42        2.87  -13% -   -3%
             Respell       92.91        1.34       92.61        1.83   -3% -    3%
        SloppyPhrase        5.35        0.16        5.00        0.29  -14% -    1%
            SpanNear        6.05        0.15        5.47        0.07  -12% -   -6%
                Term       37.62        0.32       33.08        1.70  -17% -   -6%
        TermBGroup1M       17.45        0.64       16.40        0.73  -13% -    1%
      TermBGroup1M1P       25.20        0.69       23.47        1.24  -14% -    0%
         TermGroup1M       18.53        0.65       17.40        0.76  -13% -    1%
            Wildcard       44.39        0.49       40.51        1.69  -13% -   -3%
{noformat}
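
The "Pct diff" column is a range, not a single number. The sketch below shows one way to compute it that reproduces the rows I checked by hand (AndHighHigh, Fuzzy1, TermBGroup1M1P, among others): the low end compares comp minus its standard deviation against base plus its standard deviation, the high end does the reverse, and both are truncated toward zero. This formula is inferred from the numbers in the table, not taken from the benchmark tool's source, and the names `PctDiffSketch` and `pctDiffRange` are hypothetical.

```java
final class PctDiffSketch {
    // Returns {low, high} in whole percent, truncated toward zero.
    // Assumption: the StdDev columns are absolute QPS values, as the table's numbers suggest.
    public static int[] pctDiffRange(double qpsBase, double sdBase,
                                     double qpsComp, double sdComp) {
        // Pessimistic end: comp at its slowest versus base at its fastest.
        double low = 100.0 * ((qpsComp - sdComp) - (qpsBase + sdBase)) / (qpsBase + sdBase);
        // Optimistic end: comp at its fastest versus base at its slowest.
        double high = 100.0 * ((qpsComp + sdComp) - (qpsBase - sdBase)) / (qpsBase - sdBase);
        return new int[] {(int) low, (int) high};
    }

    public static void main(String[] args) {
        int[] r = pctDiffRange(25.66, 0.31, 22.61, 1.21); // AndHighHigh row
        System.out.println(r[0] + "% - " + r[1] + "%");
    }
}
```

Read this way, a row counts as a clear regression only when both ends of the range are negative, as with AndHighHigh, AndHighMed, Phrase, and Term.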
                
> Add a useful intblock postings format (eg, FOR, PFOR, PFORDelta, Simple9/16/64, etc.)
> -------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3892
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3892
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>              Labels: gsoc2012, lucene-gsoc-12
>             Fix For: 4.1
>
>         Attachments: LUCENE-3892-BlockTermScorer.patch, LUCENE-3892-blockFor&hardcode(base).patch,
> LUCENE-3892-blockFor&packedecoder(comp).patch, LUCENE-3892-blockFor-with-packedints-decoder.patch,
> LUCENE-3892-blockFor-with-packedints-decoder.patch, LUCENE-3892-blockFor-with-packedints.patch,
> LUCENE-3892-direct-IntBuffer.patch, LUCENE-3892-for&pfor-with-javadoc.patch, LUCENE-3892-handle_open_files.patch,
> LUCENE-3892-pfor-compress-iterate-numbits.patch, LUCENE-3892-pfor-compress-slow-estimate.patch,
> LUCENE-3892_for_byte[].patch, LUCENE-3892_for_int[].patch, LUCENE-3892_for_unfold_method.patch,
> LUCENE-3892_pfor_unfold_method.patch, LUCENE-3892_pulsing_support.patch, LUCENE-3892_settings.patch,
> LUCENE-3892_settings.patch
>
>
> On the flex branch we explored a number of possible intblock
> encodings, but for whatever reason never brought them to completion.
> There are still a number of open issues with patches in various
> states.
> Initial results (based on a prototype) were excellent (see
> http://blog.mikemccandless.com/2010/08/lucene-performance-with-pfordelta-codec.html).
> I think this would make a good GSoC project.


        
