lucene-solr-dev mailing list archives

From "Chris Williams (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-629) Fuzzy search with DisMax request handler
Date Thu, 30 Apr 2009 02:51:30 GMT

    [ https://issues.apache.org/jira/browse/SOLR-629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12704450#action_12704450 ]

Chris Williams commented on SOLR-629:
-------------------------------------

Hi,
FYI: the patch didn't seem to apply cleanly on 1.3, but it worked fine on 1.4.

That said, I'm having some trouble with this patch. It doesn't seem to respect any of the
analyzer filters on my field at query time.

For example, I have a dismax query with:
    q  = the game
    qf = title_words~0.6

My 'title_words' field uses this field type:
    <fieldType name="textExactWSTokenized" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
        <filter class="solr.StandardFilterFactory"/>
        <filter class="solr.TrimFilterFactory"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>

I get this as the parsed query:
"parsedquery_toString"=>"+(((title_words:the~0.6)~0.01 (title_words:game~0.6)~0.01)~2) ()"
(I don't want it to run anything on the word 'the', because it's a stop word.)

Yet if I change qf to just 'title_words' (no fuzziness) and keep the same query text, I get
this:
"parsedquery_toString"=>"+(((title_words:game)~0.01)~1) ()"
(which is what I want)
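
To make the expectation concrete, here is a minimal sketch in plain Java of the behavior I would like to see: the query text goes through the field's analysis first, and only the surviving terms get the fuzzy suffix. This is not the patch's code and it does not use the Solr or Lucene APIs. The class name, the method name, and the hard-coded stop list are all hypothetical stand-ins (the real stop list lives in stopwords.txt), and only tokenizing, lowercasing, and stop-word removal are imitated.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class FuzzyAfterAnalysis {
    // Hypothetical stop list standing in for stopwords.txt.
    static final Set<String> STOP_WORDS =
            new HashSet<String>(Arrays.asList("the", "a", "an"));

    // Imitates whitespace tokenizing, lowercasing and stop-word removal,
    // then appends the fuzzy suffix only to the terms that survive.
    static List<String> fuzzyClauses(String field, String query, float minSimilarity) {
        List<String> clauses = new ArrayList<String>();
        for (String token : query.trim().split("\\s+")) {
            String term = token.toLowerCase(Locale.ROOT);
            if (term.isEmpty() || STOP_WORDS.contains(term)) {
                continue; // dropped by analysis, so no fuzzy clause is built
            }
            clauses.add(field + ":" + term + "~" + minSimilarity);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // prints [title_words:game~0.6]
        System.out.println(fuzzyClauses("title_words", "the game", 0.6f));
    }
}
```

With the query above, 'the' is dropped before any fuzzy clause is created, which matches what the non-fuzzy qf already does.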


> Fuzzy search with DisMax request handler
> ----------------------------------------
>
>                 Key: SOLR-629
>                 URL: https://issues.apache.org/jira/browse/SOLR-629
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 1.3
>            Reporter: Guillaume Smet
>            Priority: Minor
>         Attachments: dismax_fuzzy_query_field.v0.1.diff, dismax_fuzzy_query_field.v0.1.diff
>
>
> The DisMax search handler doesn't support fuzzy queries, which would be quite useful for
> our usage of Solr and, from what I've seen on the list, is something several people would
> like to have.
> Following this discussion:
> http://markmail.org/message/tx6kqr7ga6ponefa#query:solr%20dismax%20fuzzy+page:1+mid:c4pciq6rlr4dwtgm+state:results
> I added the ability to specify fuzzy query fields in the qf parameter. I kept the patch as
> conservative as possible.
> The syntax supported is: fieldOne^2.3 fieldTwo~0.3 fieldThree~0.2^-0.4 fieldFour, as discussed
> in the above thread.
> The recursive query aliasing should work even with fuzzy query fields using a very simple
> rule: the aliased fields inherit the minSimilarity of their parent, combined with their own
> if they have one.
> Only the qf parameter supports this syntax at the moment. I suppose we should make it usable
> in pf too. Any opinions?
> Comments are very welcome; I'll spend the time needed to put this patch in good shape.
> Thanks.
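
The qf syntax quoted above (field name, optional ~minSimilarity, optional ^boost) can be sketched as a small stand-alone parser. This is only an illustration under assumptions: the class QfFieldSpec and its parse method are hypothetical names, the code is not taken from the attached patch, and it ignores aliasing and error handling.

```java
import java.util.ArrayList;
import java.util.List;

class QfFieldSpec {
    final String field;
    final Float minSimilarity; // null when the entry has no ~ suffix
    final Float boost;         // null when the entry has no ^ suffix

    QfFieldSpec(String field, Float minSimilarity, Float boost) {
        this.field = field;
        this.minSimilarity = minSimilarity;
        this.boost = boost;
    }

    // Splits a qf string on whitespace and reads each entry as
    // field[~minSimilarity][^boost], for example "fieldThree~0.2^-0.4".
    static List<QfFieldSpec> parse(String qf) {
        List<QfFieldSpec> specs = new ArrayList<QfFieldSpec>();
        for (String entry : qf.trim().split("\\s+")) {
            if (entry.isEmpty()) {
                continue;
            }
            Float boost = null;
            Float minSimilarity = null;
            String rest = entry;
            int caret = rest.indexOf('^');
            if (caret >= 0) { // the boost comes last, so strip it first
                boost = Float.valueOf(rest.substring(caret + 1));
                rest = rest.substring(0, caret);
            }
            int tilde = rest.indexOf('~');
            if (tilde >= 0) {
                minSimilarity = Float.valueOf(rest.substring(tilde + 1));
                rest = rest.substring(0, tilde);
            }
            specs.add(new QfFieldSpec(rest, minSimilarity, boost));
        }
        return specs;
    }

    @Override
    public String toString() {
        return field + " minSimilarity=" + minSimilarity + " boost=" + boost;
    }

    public static void main(String[] args) {
        for (QfFieldSpec spec : parse("fieldOne^2.3 fieldTwo~0.3 fieldThree~0.2^-0.4 fieldFour")) {
            System.out.println(spec);
        }
    }
}
```

Keeping the two suffixes as nullable values makes it easy to tell "no fuzziness requested" apart from an explicit similarity, which matters for the inheritance rule described for aliased fields.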

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

