lucene-solr-user mailing list archives

From "Allison, Timothy B." <talli...@mitre.org>
Subject RE: Complexphrase treats wildcards differently than other query parsers
Date Fri, 06 Oct 2017 18:54:44 GMT
That could be it.  I'm not able to reproduce this with trunk.  More next week.

In trunk, if I add this to schema15.xml:
  <fieldType name="text_iso_latin1_mapping" class="solr.TextField">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
      <tokenizer class="solr.MockTokenizerFactory"/>
    </analyzer>
  </fieldType>
  <field name="iso-latin1" type="text_iso_latin1_mapping" indexed="true" stored="true"/>

then this test passes:

  @Test
  public void testCharFilter() {
    assertU(adoc("iso-latin1", "cr\u00E6zy tr\u00E6n", "id", "1"));
    assertU(commit());
    assertU(optimize());

    assertQ(req("q", "{!complexphrase} iso-latin1:craezy")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:traen")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:caezy~1")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:crae*")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:*aezy")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:crae*y")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"craezy traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"caezy~1 traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"craez* traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"*aezy traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );

    assertQ(req("q", "{!complexphrase} iso-latin1:\"crae*y traen\"")
        , "//result[@numFound='1']"
        , "//doc[./str[@name='id']='1']"
    );
  }



-----Original Message-----
From: Bjarke Buur Mortensen [mailto:mortensen@eluence.com] 
Sent: Friday, October 6, 2017 6:46 AM
To: solr-user@lucene.apache.org
Subject: Re: Complexphrase treats wildcards differently than other query parsers

Thanks a lot for your effort, Tim.

Looking at it from the Solr side, I see some use of local classes. The snippet below
(in solr/core/src/java/org/apache/solr/search/ComplexPhraseQParserPlugin.java) caught my
eye in particular. The ComplexPhraseQueryParser instance is not the clean one from Lucene
but an anonymous subclass that overrides it. If any of those overrides interferes with the
analysis logic, that might explain it.

What do you make of it?

lparser = new ComplexPhraseQueryParser(defaultField, getReq().getSchema().getQueryAnalyzer()) {
  protected Query newWildcardQuery(org.apache.lucene.index.Term t) {
    try {
      org.apache.lucene.search.Query wildcardQuery =
          reverseAwareParser.getWildcardQuery(t.field(), t.text());
      setRewriteMethod(wildcardQuery);
      return wildcardQuery;
    } catch (SyntaxError e) {
      throw new RuntimeException(e);
    }
  }

  private Query setRewriteMethod(org.apache.lucene.search.Query query) {
    if (query instanceof MultiTermQuery) {
      ((MultiTermQuery) query).setRewriteMethod(
          org.apache.lucene.search.MultiTermQuery.SCORING_BOOLEAN_REWRITE);
    }
    return query;
  }

  protected Query newRangeQuery(String field, String part1, String part2,
      boolean startInclusive, boolean endInclusive) {
    boolean reverse = reverseAwareParser.isRangeShouldBeProtectedFromReverse(field, part1);
    return super.newRangeQuery(field,
        reverse ? reverseAwareParser.getLowerBoundForReverse() : part1, part2,
        startInclusive || reverse, endInclusive);
  }
};
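
To make the concern concrete, here is a minimal, stdlib-only Java sketch of what would go wrong if a wildcard term skipped the char-filter mapping. It is not Lucene or Solr code: the class name, the two-entry MAPPING table (a stand-in for the relevant entries of mapping-ISOLatin1Accent.txt, assumed to map "æ" to "ae" as the test in this thread implies) and both helper methods are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.regex.Pattern;

public class WildcardFoldingSketch {

    // Assumption: a tiny stand-in for the char-filter mapping file.
    static final Map<Character, String> MAPPING =
            Map.of('\u00E6', "ae", '\u00C6', "AE");

    // Replace each mapped character with its replacement; keep the rest as-is.
    static String fold(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            String mapped = MAPPING.get(c);
            sb.append(mapped != null ? mapped : String.valueOf(c));
        }
        return sb.toString();
    }

    // Simplified wildcard match: '*' is any run of characters, '?' is one character.
    static boolean wildcardMatches(String pattern, String term) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), term);
    }

    public static void main(String[] args) {
        // The index holds the folded form of the original text.
        String indexedTerm = fold("cr\u00E6zy");

        // Wildcard term used as typed, without the mapping applied.
        System.out.println(wildcardMatches("cr\u00E6*", indexedTerm));

        // Wildcard term passed through the same mapping first.
        System.out.println(wildcardMatches(fold("cr\u00E6*"), indexedTerm));
    }
}
```

In this sketch the unmapped pattern misses the indexed term, while the mapped pattern finds it. Whether the overrides above actually bypass the mapping in Solr is exactly the open question; the sketch only shows the symptom such a bypass would produce.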

Thanks,
Bjarke

