lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8741) Json Facet API, numBuckets not returning real number of buckets.
Date Sat, 27 Feb 2016 22:54:18 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15170752#comment-15170752 ]

Yonik Seeley commented on SOLR-8741:
------------------------------------

IIRC, numBuckets uses the same estimation algorithm as "unique" (described here: http://yonik.com/solr-count-distinct/),
which predates the addition of HyperLogLog, so the value it reports is an estimate rather than an exact count.

We should probably add a way to use hll for numBuckets as well, but for now you may be able
to work around this by using hll directly yourself.

Example:
{code}
json.facet={
  numCat:"hll(cat)",
  categories: {
    type : terms,
    field : cat
  }
}
{code}

That should work for the common case, but not for cases such as mincount=N (where N>1),
or for domain-switching techniques like block join.
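
For anyone scripting this, here is a minimal Python sketch of the workaround above. It only builds the request parameters and reads the two counts back out of a parsed response; it does not contact Solr. The function names are hypothetical, and it assumes that numBuckets:true is set on the terms facet and that the response carries the results under a top-level "facets" object.

```python
import json


def build_facet_params(field, query="*:*"):
    """Build query parameters asking for a terms facet plus an hll() estimate.

    The facet labels "numCat" and "categories" follow the example above.
    """
    facet = {
        # hll(field) estimates the number of distinct values with HyperLogLog.
        "numCat": f"hll({field})",
        "categories": {
            "type": "terms",
            "field": field,
            # Ask the terms facet to report its own bucket-count estimate too.
            "numBuckets": True,
        },
    }
    # rows=0 skips the document list; only the facets are of interest here.
    return {"q": query, "rows": 0, "json.facet": json.dumps(facet)}


def compare_counts(facets):
    """Return both estimates from the "facets" section of a parsed response."""
    return {
        "numBuckets": facets.get("categories", {}).get("numBuckets"),
        "hll": facets.get("numCat"),
    }
```

The dictionary returned by build_facet_params can be passed as the query parameters of a request to the collection's select handler. Neither number is exact, and the caveats above (mincount, block join) still apply.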

> Json Facet API, numBuckets not returning real number of buckets.
> ----------------------------------------------------------------
>
>                 Key: SOLR-8741
>                 URL: https://issues.apache.org/jira/browse/SOLR-8741
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Pablo Anzorena
>
> Hi, using the json facet api I realized that the numBuckets is wrong. It is not returning
> the right number of buckets. I have a dimension which numBuckets says it has 1340, but when
> retrieving all the results it brings 988.
> FYI the field is of type string.
> Thanks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

