lucene-dev mailing list archives

From "Yonik Seeley (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SOLR-3793) duplicate (deleted) documents included in result set when using field faceting with fq
Date Thu, 06 Sep 2012 11:43:08 GMT

     [ https://issues.apache.org/jira/browse/SOLR-3793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-3793:
-------------------------------

    Attachment: SOLR-3793.patch

Here's a patch to fix the problem.

The issue was that when UnInvertedField faceting cached big terms as filters, it failed to
set/use liveDocs.  Later, an "fq" retrieved that filter from the cache and used it as
liveDocs, bringing deleted docs back from the dead.
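
To illustrate the failure mode described above, here is a minimal sketch. It is not the
attached patch and does not use the Lucene/Solr APIs: it models doc sets with
java.util.BitSet, and the class, method, and cache names are hypothetical. It assumes only
what the comment states, namely that a filter cached without applying liveDocs is later
used by an "fq" in place of liveDocs.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

public class LiveDocsSketch {
    // Hypothetical stand-in for a filter cache keyed by term.
    static final Map<String, BitSet> filterCache = new HashMap<>();

    // Buggy variant: caches every doc containing the term, deleted or not.
    static void cacheBigTermBuggy(String term, BitSet docsWithTerm) {
        filterCache.put(term, (BitSet) docsWithTerm.clone());
    }

    // Fixed variant: intersect with liveDocs before caching.
    static void cacheBigTermFixed(String term, BitSet docsWithTerm, BitSet liveDocs) {
        BitSet filter = (BitSet) docsWithTerm.clone();
        filter.and(liveDocs);
        filterCache.put(term, filter);
    }

    // "fq" path: uses the cached filter in place of liveDocs, so it never
    // re-checks deletions and counts whatever the cached filter contains.
    static int numFoundWithFq(String term) {
        return filterCache.get(term).cardinality();
    }

    public static void main(String[] args) {
        BitSet docsWithTerm = new BitSet();
        docsWithTerm.set(0, 3);          // docs 0, 1, 2 contain the term

        BitSet liveDocs = new BitSet();
        liveDocs.set(0);
        liveDocs.set(2);
        liveDocs.set(3);                 // doc 1 has been deleted

        cacheBigTermBuggy("nebis", docsWithTerm);
        System.out.println("buggy numFound = " + numFoundWithFq("nebis")); // 3

        cacheBigTermFixed("nebis", docsWithTerm, liveDocs);
        System.out.println("fixed numFound = " + numFoundWithFq("nebis")); // 2
    }
}
```

In the buggy case the deleted doc 1 is counted again, which matches the reported symptom of
numFound exceeding the facet count for the same constraint.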
                
> duplicate (deleted) documents included in result set when using field faceting with fq
> --------------------------------------------------------------------------------------
>
>                 Key: SOLR-3793
>                 URL: https://issues.apache.org/jira/browse/SOLR-3793
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.0-BETA
>            Reporter: Hoss Man
>            Assignee: Yonik Seeley
>            Priority: Blocker
>             Fix For: 4.0
>
>         Attachments: SOLR-3793.patch
>
>
> Günter Hipler reported on the solr-user mailing list that he was seeing inconsistencies
in facet counts compared to the numFound when drilling down onto those facets (using "fq")
- in particular: when adding an "fq" such as `fq={!term+f%3DnavNetwork}nebis`, the resulting
numFound was higher than the number of docs reported by the facet constraint for nebis in
the base request.
> I've been able to trivially reproduce this using the example data from Solr 4.0-BETA,
trunk@r1381400, and branch_4x@r1381400 (details in comment to follow)
> Important things to note from Günter's email thread with his assessment of the problem...
> https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201208.mbox/%3CCAM_U7jfDpNrGfmWmNtNACHCDCJw4YB-rLBBvRW_WP_jdOb_cgw@mail.gmail.com%3E
> bq. The behaviour is not consistent. Some of the facets provide the correct result, some
not.  What I can't say for sure: The behaviour was correct (if I'm not wrong) once the whole
index was newly created. After running some updates I got these results.
> bq. I'm going to setup a new index with the Lucene 4.0 version from March (to be more
exactly: it's version 4.0-2012-03-09_11-29-20) to see what are the results even in case of
frequent updates ... the version deployed in march doesn't contain the error I now come across
in Beta4.0 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

