lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Hoss Man (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-8496) Facet search count numbers are falsified by older document versions
Date Thu, 14 Jan 2016 21:36:40 GMT

    [ https://issues.apache.org/jira/browse/SOLR-8496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15098922#comment-15098922
] 

Hoss Man commented on SOLR-8496:
--------------------------------

Also: does this reproduce for you when indexing from scratch, or is this an index you originally
built with an older version of Solr and then upgraded to 5.4? (trying to figure out if there
are older segments and maybe the bug is specific to 5.4 reading deleted docs from those older
segments)

can you also run CheckIndex (command line) and provide all of that output?

> Facet search count numbers are falsified by older document versions
> -------------------------------------------------------------------
>
>                 Key: SOLR-8496
>                 URL: https://issues.apache.org/jira/browse/SOLR-8496
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 5.4
>         Environment: Linux 3.16.0-4-amd64 x86_64 Debian 8.2
> openjdk-7-jre-headless:amd64   version 7u91-2.6.3-1~deb8u1
> solr-5.4.0, extracted from official tar
> Default solr settings from install script:SOLR_HEAP="512m"
> GC_LOG_OPTS="-verbose:gc -XX:+PrintHeapAtGC -XX:+PrintGCDetails \
> -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime"
> GC_TUNE="-XX:NewRatio=3 \
> -XX:SurvivorRatio=4 \
> -XX:TargetSurvivorRatio=90 \
> -XX:MaxTenuringThreshold=8 \
> -XX:+UseConcMarkSweepGC \
> -XX:+UseParNewGC \
> -XX:ConcGCThreads=4 -XX:ParallelGCThreads=4 \
> -XX:+CMSScavengeBeforeRemark \
> -XX:PretenureSizeThreshold=64m \
> -XX:+UseCMSInitiatingOccupancyOnly \
> -XX:CMSInitiatingOccupancyFraction=50 \
> -XX:CMSMaxAbortablePrecleanTime=6000 \
> -XX:+CMSParallelRemarkEnabled \
> -XX:+ParallelRefProcEnabled"
> SOLR_OPTS="$SOLR_OPTS -Xss256k"
>            Reporter: Andreas Müller
>
> Our setup is based on multiple cores. In One core we have a multi-filed with integer
values. and some other unimportant fields. We're using multi-faceting for this field.
> We're querying a test scenario with:
> {code}
> http://localhost:8983/solr/core-name/select?q=dummyask: (true) AND manufacturer: false
AND id: (15039 16882 10850 20781)&fq={!tag=professions}professions: (59)&fl=id&wt=json&indent=true&facet=true&facet.field={!ex=professions}professions
> {code}
> - Query: (numDocs:48545, maxDoc:48545)
> {code:xml}
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1</int>
> </lst>
> <result name="response" numFound="4" start="0">
> <doc>
> <int name="id">10850</int>
> </doc>
> <doc>
> <int name="id">16882</int>
> </doc>
> <doc>
> <int name="id">15039</int>
> </doc>
> <doc>
> <int name="id">20781</int>
> </doc>
> </result>
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">
> <lst name="professions">
> <int name="59">4</int>
> </lst>
> </lst>
> <lst name="facet_dates"/>
> <lst name="facet_ranges"/>
> <lst name="facet_intervals"/>
> <lst name="facet_heatmaps"/>
> </lst>
> </response>
> {code}
> - Then we update one document and change some fields (numDocs:48545, maxDoc:48546) *The
number of maxDocs is increased*
> {code:xml}
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">1</int>
> </lst>
> <result name="response" numFound="4" start="0">
> <doc>
> <int name="id">10850</int>
> </doc>
> <doc>
> <int name="id">16882</int>
> </doc>
> <doc>
> <int name="id">15039</int>
> </doc>
> <doc>
> <int name="id">20781</int>
> </doc>
> </result>
> <lst name="facet_counts">
> <lst name="facet_queries"/>
> <lst name="facet_fields">
> <lst name="professions">
> <int name="59">5</int>
> </lst>
> </lst>
> <lst name="facet_dates"/>
> <lst name="facet_ranges"/>
> <lst name="facet_intervals"/>
> <lst name="facet_heatmaps"/>
> </lst>
> </response>
> {code}
> *The Problem:*
> In the first query, we're getting a facet count of 4, which is correct. After updating
one document, we're getting 5 as a result wich is not correct.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message