lucene-lucene-net-dev mailing list archives

From "George Aroush (JIRA)" <j...@apache.org>
Subject [jira] Assigned: (LUCENENET-23) add c# version of FuzzyLikeThisQuery.java to contrib section
Date Thu, 08 Mar 2007 18:13:24 GMT

     [ https://issues.apache.org/jira/browse/LUCENENET-23?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

George Aroush reassigned LUCENENET-23:
--------------------------------------

    Assignee:     (was: George Aroush)

> add c# version of FuzzyLikeThisQuery.java to contrib section
> ------------------------------------------------------------
>
>                 Key: LUCENENET-23
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-23
>             Project: Lucene.Net
>          Issue Type: New Feature
>         Environment: n/a
>            Reporter: Marco Dissel
>         Attachments: FuzzyLikeThisQuery.cs
>
>
> I've converted FuzzyLikeThisQuery.java to C#. Maybe George can add this to the
> contrib section?
> original file is stored at :
> http://svn.apache.org/viewvc/lucene/java/trunk/contrib/queries/src/java/org/apache/lucene/search/FuzzyLikeThisQuery.java?revision=413732&view=markup
> Fuzzifies ALL terms provided as strings and then picks the best n differentiating terms.
> In effect this mixes the behaviour of FuzzyQuery and MoreLikeThis but with special
> consideration of fuzzy scoring factors.
> This generally produces good results for queries where users may provide details in a
> number of fields, have no knowledge of boolean query syntax, and also want a degree of
> fuzzy matching and a fast query.
> For each source term the fuzzy variants are held in a BooleanQuery with no coord factor
> (because we are not looking for matches on multiple variants in any one doc).
> Additionally, a specialized TermQuery is used for variants and does not use that variant
> term's IDF, because this would favour rarer terms, e.g. misspellings. Instead, all
> variants use the same IDF ranking (the one for the source query term) and this is
> factored into the variant's boost. If the source query term does not exist in the index,
> the average IDF of the variants is used.
> @author maharwood
> ps. there's no java test class...
> Thanks
> Marco
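
The IDF rule in the quoted description can be sketched in plain Java, without
any Lucene dependency. This is only an illustration under stated assumptions,
not the code in the attached file or in FuzzyLikeThisQuery.java: the class
name VariantIdfSketch and its methods are hypothetical, the IDF formula is the
classic log(numDocs / (docFreq + 1)) + 1 form, and multiplying the fuzzy
similarity by the shared IDF is just one plausible way of "factoring" it into
the boost. The no-coord BooleanQuery is not modelled here.

```java
// Hypothetical sketch: one shared IDF for all fuzzy variants of a source term.
final class VariantIdfSketch {

    // Classic IDF form, assumed for illustration only.
    static double idf(int docFreq, int numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // All variants share one IDF: the source term's IDF if the source term
    // occurs in the index, otherwise the average IDF of the variants.
    static double sharedIdf(int sourceDocFreq, int[] variantDocFreqs, int numDocs) {
        if (sourceDocFreq > 0) {
            return idf(sourceDocFreq, numDocs);
        }
        if (variantDocFreqs.length == 0) {
            return 0.0; // nothing to average
        }
        double sum = 0.0;
        for (int df : variantDocFreqs) {
            sum += idf(df, numDocs);
        }
        return sum / variantDocFreqs.length;
    }

    // Assumption: the shared IDF scales the variant's fuzzy similarity.
    static double variantBoost(double similarity, double sharedIdf) {
        return similarity * sharedIdf;
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        int[] variantDocFreqs = {2, 99}; // e.g. a rare misspelling and a common variant

        // Source term present in the index: its IDF is used for every variant.
        double present = sharedIdf(9, variantDocFreqs, numDocs);
        // Source term absent from the index: fall back to the variants' average IDF.
        double absent = sharedIdf(0, variantDocFreqs, numDocs);

        System.out.println("shared IDF (source present): " + present);
        System.out.println("shared IDF (source absent):  " + absent);
        System.out.println("boost for a variant with similarity 0.8: "
                + variantBoost(0.8, present));
    }
}
```

Because the rare misspelling (docFreq 2) and the common variant (docFreq 99)
receive the same IDF, the misspelling is not favoured merely for being rare,
which is the effect the description is after.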

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

