lucene-dev mailing list archives

From "Jeff Wartes (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SOLR-9125) CollapseQParserPlugin allocations are index based, not query based
Date Tue, 17 May 2016 16:05:12 GMT
Jeff Wartes created SOLR-9125:
---------------------------------

             Summary: CollapseQParserPlugin allocations are index based, not query based
                 Key: SOLR-9125
                 URL: https://issues.apache.org/jira/browse/SOLR-9125
             Project: Solr
          Issue Type: Improvement
          Components: query parsers
            Reporter: Jeff Wartes
            Priority: Minor


Among other things, CollapsingQParserPlugin’s OrdScoreCollector allocates space per query for:
1 int (doc id) per ordinal
1 float (score) per ordinal
1 bit (FixedBitSet) per document in the index

So the higher the cardinality of the field you’re collapsing on, and the more documents in
the index, the more memory gets consumed per query. Since high cardinality and large indexes
are the use-cases CollapsingQParserPlugin was designed for, I thought I'd point this out.
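
To put rough numbers on the allocations listed above, here is a minimal back-of-the-envelope sketch. It is not code from Solr: the class name, method name, and the example index sizes are hypothetical, and it ignores array object headers and anything else the collector allocates. The only assumption about Lucene is that a FixedBitSet stores its bits in 64-bit words.

```java
// Hypothetical estimator, not part of Solr. It only adds up the three
// per-query allocations described above.
class CollapseMemoryEstimate {

    /**
     * Approximate bytes allocated per query:
     * one int and one float per ordinal, plus one bit per document.
     */
    static long estimateBytes(long ordinalCount, long maxDoc) {
        long docIdBytes = ordinalCount * Integer.BYTES;  // 1 int (doc id) per ordinal
        long scoreBytes = ordinalCount * Float.BYTES;    // 1 float (score) per ordinal
        long words = (maxDoc + 63) / 64;                 // bits rounded up to 64-bit words
        long bitSetBytes = words * Long.BYTES;           // 1 bit per document in the index
        return docIdBytes + scoreBytes + bitSetBytes;
    }

    public static void main(String[] args) {
        // Assumed example sizes: 10M distinct collapse values, 100M documents.
        long bytes = estimateBytes(10_000_000L, 100_000_000L);
        System.out.println(bytes + " bytes per query");
    }
}
```

With those assumed sizes the estimate comes to 92,500,000 bytes, roughly 92.5 MB per query. Note that both inputs are properties of the index; nothing about the query itself enters the calculation.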

My real issue is that this allocation does not vary with the number of results in the query,
either before or after collapsing, so a query that matches one doc consumes the same amount of
memory as one that matches every doc in the index. All of the Collectors suffer from this to
some degree, but I think OrdScore is the worst offender.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

