lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-4769) Add a CountingFacetsAggregator which reads ordinals from a cache
Date Mon, 11 Feb 2013 20:59:13 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-4769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13576119#comment-13576119 ]

Michael McCandless commented on LUCENE-4769:
--------------------------------------------

Full wikibig index (6.6M docs), 9 dims.  Base is trunk; comp is the int[] cache:
{noformat}
                    Task    QPS base      StdDev    QPS comp      StdDev                Pct diff
                 Respell       44.87      (3.2%)       44.32      (3.8%)   -1.2% (  -7% -    5%)
              AndHighLow       64.10      (1.8%)       63.33      (1.1%)   -1.2% (  -4% -    1%)
             LowSpanNear        7.10      (2.1%)        7.18      (2.0%)    1.1% (  -2% -    5%)
                  Fuzzy2       28.55      (2.0%)       29.06      (1.6%)    1.8% (  -1% -    5%)
              HighPhrase       13.69      (8.7%)       13.98      (8.3%)    2.1% ( -13% -   20%)
         LowSloppyPhrase       15.40      (2.6%)       15.85      (1.6%)    3.0% (  -1% -    7%)
              AndHighMed       39.48      (1.0%)       41.07      (0.8%)    4.0% (   2% -    5%)
            HighSpanNear        2.91      (1.3%)        3.03      (1.1%)    4.2% (   1% -    6%)
               LowPhrase       15.01      (4.4%)       15.80      (4.7%)    5.2% (  -3% -   14%)
             MedSpanNear       17.68      (1.4%)       18.87      (1.2%)    6.7% (   3% -    9%)
         MedSloppyPhrase       16.56      (1.3%)       17.82      (1.3%)    7.6% (   5% -   10%)
               MedPhrase       41.08      (2.5%)       44.27      (2.8%)    7.8% (   2% -   13%)
                  Fuzzy1       24.08      (1.5%)       25.97      (1.9%)    7.9% (   4% -   11%)
        HighSloppyPhrase        0.82      (6.2%)        0.89      (5.6%)    9.1% (  -2% -   22%)
                 LowTerm       34.22      (1.1%)       39.13      (1.3%)   14.3% (  11% -   16%)
             AndHighHigh       11.87      (1.3%)       13.96      (1.2%)   17.6% (  14% -   20%)
                Wildcard       13.02      (1.9%)       16.02      (1.5%)   23.1% (  19% -   27%)
                 MedTerm       20.04      (2.2%)       24.86      (2.4%)   24.1% (  19% -   29%)
               OrHighMed        7.02      (2.7%)        9.85      (2.3%)   40.3% (  34% -   46%)
               OrHighLow        7.08      (2.8%)        9.95      (2.8%)   40.5% (  33% -   47%)
                HighTerm        7.52      (2.5%)       10.78      (2.1%)   43.4% (  37% -   49%)
                 Prefix3        5.71      (2.4%)        8.51      (1.5%)   48.9% (  43% -   54%)
              OrHighHigh        3.99      (2.3%)        6.26      (2.6%)   56.8% (  50% -   63%)
                  IntNRQ        1.91      (2.3%)        3.66      (3.0%)   91.9% (  84% -   99%)
{noformat}

The cache is 291.5 MB versus 129.0 MB for trunk, i.e. 2.26X larger.
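
For readers checking the numbers: the Pct diff column and the 2.26X figure look like plain ratios. The sketch below assumes Pct diff is (comp - base) / base; the benchmark tool's exact formula is not shown in this message, and the class and method names are made up for illustration. Recomputing from the rounded QPS values printed above can differ from the table by a few tenths of a percent (IntNRQ, for example), presumably because the tool works from unrounded numbers.

```java
public class BenchArithmetic {

    /** Relative change of comp over base, in percent (assumed formula). */
    public static double pctDiff(double baseQps, double compQps) {
        return 100.0 * (compQps - baseQps) / baseQps;
    }

    /** How many times larger the cache is than the trunk representation. */
    public static double sizeRatio(double cacheMb, double trunkMb) {
        return cacheMb / trunkMb;
    }

    public static void main(String[] args) {
        // HighTerm row: base 7.52 QPS, comp 10.78 QPS
        System.out.printf("HighTerm pct diff: %.1f%%%n", pctDiff(7.52, 10.78));
        // RAM figures reported in the comment, in MB
        System.out.printf("cache / trunk: %.2fX%n", sizeRatio(291.5, 129.0));
    }
}
```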

                
> Add a CountingFacetsAggregator which reads ordinals from a cache
> ----------------------------------------------------------------
>
>                 Key: LUCENE-4769
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4769
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/facet
>            Reporter: Shai Erera
>            Assignee: Shai Erera
>         Attachments: LUCENE-4769.patch
>
>
> Mike wrote a prototype of a FacetsCollector which reads ordinals from a CachedInts structure
> on LUCENE-4609. I ported it to the new facets API, as a FacetsAggregator. I think we should
> offer users the means to use such a cache, even if it consumes more RAM. Mike's tests show that
> this cache consumed 2x more RAM than if the DocValues were loaded into memory in their raw
> form. Also, a PackedInts version of such a cache took almost the same amount of RAM as a straight
> int[], but the gains were minor.
> I will post the patch shortly.
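
To make the idea in the description concrete, here is a minimal sketch of counting facet ordinals from an int[] cache. It is not the attached patch and does not use the Lucene facets API: the class name, the flat offsets-plus-ordinals layout, and the method signatures are all assumptions chosen for illustration. The point is only that, once ordinals are decoded into plain ints up front, counting per matching document becomes a tight array loop, at the cost of extra RAM.

```java
public class OrdinalCacheSketch {
    // Hypothetical layout: ordinals of document d live in
    // ordinals[offsets[d]] .. ordinals[offsets[d + 1] - 1].
    private final int[] offsets;
    private final int[] ordinals;

    /** Builds the flat cache from per-document ordinal arrays (already decoded). */
    public OrdinalCacheSketch(int[][] perDocOrdinals) {
        offsets = new int[perDocOrdinals.length + 1];
        for (int doc = 0; doc < perDocOrdinals.length; doc++) {
            offsets[doc + 1] = offsets[doc] + perDocOrdinals[doc].length;
        }
        ordinals = new int[offsets[perDocOrdinals.length]];
        for (int doc = 0; doc < perDocOrdinals.length; doc++) {
            System.arraycopy(perDocOrdinals[doc], 0, ordinals, offsets[doc],
                    perDocOrdinals[doc].length);
        }
    }

    /** Counts how often each ordinal occurs across the matching documents. */
    public int[] count(int[] matchingDocs, int numOrdinals) {
        int[] counts = new int[numOrdinals];
        for (int doc : matchingDocs) {
            for (int i = offsets[doc]; i < offsets[doc + 1]; i++) {
                counts[ordinals[i]]++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        int[][] perDoc = { {0, 2}, {1}, {0, 1, 2}, {} };
        OrdinalCacheSketch cache = new OrdinalCacheSketch(perDoc);
        int[] counts = cache.count(new int[] {0, 2}, 3);
        System.out.println(java.util.Arrays.toString(counts));
    }
}
```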
