lucene-dev mailing list archives

From "Yonik Seeley (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2829) Filter queries have false-positive matches. Exposed by user's list titled "Regarding geodist and multiple location fields"
Date Wed, 02 Nov 2011 03:40:32 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141887#comment-13141887 ]

Yonik Seeley commented on SOLR-2829:
------------------------------------

bq. "&&" binds more tightly than "?:"

{code}
    return super.equals(other)
            && this.creator == null ? other.creator == null :
            this.creator.getClass() == other.creator.getClass();
{code}

Ouch!  Given that "creator" is never null (for trie fields at least), the "&&" condition is
always false, so this always boils down to just comparing the creator class.
What usually saves us is that the hash codes will normally slot to different buckets, and
the fact that we start off with a relatively large number of buckets (size=512, which means
1024 buckets when accounting for load factor and rounding up to the next power of two).
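
To make the precedence problem concrete, here is a minimal standalone sketch. It is not the Solr code: the class `Source`, its `field` member, and the `sameField()` helper are hypothetical stand-ins, with `sameField()` playing the role of `super.equals(other)`. Only the shape of the boolean expression is taken from the snippet quoted above.

```java
public class PrecedenceDemo {
    // Hypothetical stand-in for a value source; not the real Solr class.
    static final class Source {
        final String field;    // stands in for whatever super.equals() compares
        final Object creator;  // stands in for this.creator

        Source(String field, Object creator) {
            this.field = field;
            this.creator = creator;
        }

        // Plays the role of super.equals(other) in the quoted code.
        boolean sameField(Source other) {
            return field.equals(other.field);
        }

        // Same shape as the quoted expression. Java parses it as
        // (sameField(other) && this.creator == null) ? ... : ...
        boolean buggyEquals(Source other) {
            return sameField(other)
                    && this.creator == null ? other.creator == null :
                    this.creator.getClass() == other.creator.getClass();
        }

        // Parenthesised so the whole ternary is the right operand of &&.
        boolean fixedEquals(Source other) {
            return sameField(other)
                    && (this.creator == null ? other.creator == null
                        : this.creator.getClass() == other.creator.getClass());
        }
    }

    public static void main(String[] args) {
        Source home = new Source("home", new Object());
        Source work = new Source("work", new Object());
        System.out.println(home.buggyEquals(work)); // true: only creator classes are compared
        System.out.println(home.fixedEquals(work)); // false: the fields differ
    }
}
```

With a non-null creator the condition before "?" is always false, so the buggy version never uses the result of the field comparison at all.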

This is a bad bug since it can stay hidden and strike randomly.
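
Why it can stay hidden can be sketched with a plain `java.util.HashMap` rather than Solr's caches. The `Key` class below is hypothetical: its hash is supplied by hand so a collision can be forced, and its `equals()` is deliberately over-permissive to mimic a comparison that ignores the field name. This is an illustration under those assumptions, not a reproduction of the actual cache code.

```java
import java.util.HashMap;
import java.util.Map;

public class FalseHitDemo {
    // Hypothetical cache key; the hash is injected so collisions can be forced.
    static final class Key {
        final String field;
        final int hash;

        Key(String field, int hash) {
            this.field = field;
            this.hash = hash;
        }

        @Override
        public int hashCode() {
            return hash;
        }

        // Deliberately broken: ignores the field, like an equals() that
        // only checks that both sides are the same kind of object.
        @Override
        public boolean equals(Object o) {
            return o instanceof Key;
        }
    }

    // Caches a result under a "home" key, then looks it up with a "work" key.
    static String lookup(int homeHash, int workHash) {
        Map<Key, String> cache = new HashMap<>();
        cache.put(new Key("home", homeHash), "docs near home");
        return cache.get(new Key("work", workHash));
    }

    public static void main(String[] args) {
        System.out.println(lookup(1, 2)); // null: hashes differ, equals() is never consulted
        System.out.println(lookup(7, 7)); // docs near home: collision exposes the broken equals()
    }
}
```

As long as the hashes differ the broken `equals()` is never reached, which is why the false positive only shows up for unlucky combinations of keys.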

bq. (unless you know of some reason why NumericFieldCacheSource should only care about equality
on this.creator.getClass() instead of this.creator ?)

I never fully grokked the creator stuff... and I understand the trunk fieldcache code is slated
to be replaced by the 3x fieldcache code, so I wouldn't worry about cleaning anything up (other
than making it work).
                
> Filter queries have false-positive matches. Exposed by user's list titled "Regarding
geodist and multiple location fields"
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-2829
>                 URL: https://issues.apache.org/jira/browse/SOLR-2829
>             Project: Solr
>          Issue Type: Bug
>          Components: search
>    Affects Versions: 3.4, 4.0
>         Environment: N/A
>            Reporter: Erick Erickson
>            Priority: Blocker
>             Fix For: 3.5
>
>         Attachments: SOLR-2829.patch, SOLR-2829.patch, SOLR-2829.patch
>
>
> I don't know how generic this is, whether it's just a
> problem with fqs when combined with spatial or whether
> it has wider applicability, but here's what I know so far.
> Marc Tinnemeyer in a post titled:
> "Regarding geodist and multiple location fields"
> outlines this. I checked this on 3.4 and trunk and it's
> weird in both cases.
> HOLD THE PRESSES:
> After looking at this a bit more, it looks like a caching
> issue, NOT a geodist issue. When I bounce Solr
> between changing the sfield from "home" to "work",
> it seems to work as expected.
> Hmmmm, very strange. If I comment out BOTH
> the filterCache and queryResultCache then it works
> fine. Switching from "home" to "work" in the query
> finds/fails to find the document.
> But commenting out only one of those caches
> doesn't fix the problem.
> On trunk I used this query, just flipping "home" to "work" and back:
> http://localhost:8983/solr/select?q=id:1&fq={!geofilt sfield=home
> pt=52.67,7.30 d=5}
> The info below is what I used to test.
> From Marc's posts:
> <field name="home" type="location" indexed="true" stored="true"/>
> <field name="work" type="location" indexed="true" stored="true"/>
> <field name="elsewhere" type="location" indexed="true" stored="true"/>
> At first I thought so too. Here is a simple document.
> <add>
>       <doc>
>               <field name="id">1</field>
>               <field name="name">first</field>
>               <field name="work">48.60,11.61</field>
>               <field name="home">52.67,7.30</field>
>       </doc>
> </add>
> and here is the result that shouldn't be:
> <response>
> ...
> <str name="q">*:*</str>
> <str name="fq">{!geofilt sfield=work pt=52.67,7.30 d=5}</str>
> ...
> </lst>
> </lst>
> <result name="response" numFound="1" start="0">
> <doc>
> <str name="home">52.67,7.30</str>
> <str name="id">1</str>
> <str name="name">first</str>
> <str name="work">48.60,11.61</str>
> </doc>
> </result>
> </response>
> ****Yonik's comment******
> It's going to be a bug in an equals() implementation somewhere in the query.
> The top level equals will be SpatialDistanceQuery.equals() (from
> LatLonField.java)
> On trunk, I already see a bug introduced when the new lucene field
> cache stuff was done.
> DoubleValueSource now just inherits its equals method from
> NumericFieldCacheSource... and that equals() method only tests if the
> CachedArrayCreator.getClass() is the same!  That's definitely wrong.
> I don't know why 3x would also have this behavior (unless there's more
> than one bug!)
> Anyway, first step is to modify the spatial tests to catch the bug...
> from there it should be pretty easy to debug.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

