lucene-dev mailing list archives

From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3130) Use BoostAttribute in TokenFilters to denote Terms that QueryParser should give lower boosts
Date Wed, 31 Aug 2011 14:12:09 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094554#comment-13094554 ]

Robert Muir commented on LUCENE-3130:
-------------------------------------

Jan, I'm not sure that most people only use single-term synonyms... if that is the case, maybe
we should rethink our synonyms implementation, because multi-word support adds a ton of complexity!

Another reason I suggested not adding this to the core queryparser is that it's going
to be challenging to allow this optional boosting in a flexible way (just look at getFieldQuery...
it's very hairy). I think in the ideal case, we somehow restructure all this code so that subclasses
have more control over how the query is created... however, I think this might be challenging
just given how the code is structured now.

The reason I think it would be best exposed as a 'hook' to subclasses (versus adding a "deboost
synonyms" option directly to the core QP), is that I think people are going to want to customize
how this works, e.g. control it per-field and things like that.
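
As a rough illustration of the 'hook' idea, here is a minimal sketch in plain Java. It does not use the real QueryParser API: SketchParser, injectedTermBoost, clause and PerFieldDeboostParser are all hypothetical names, and queries are rendered as strings only to keep the example self-contained. It assumes the base class exposes one overridable method that decides the boost for an analyzer-injected term, and a subclass customizes it per field.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a query parser; NOT Lucene's QueryParser API.
class SketchParser {
    /**
     * Hook for subclasses: boost to apply to a term that the analyzer
     * injected (e.g. a synonym). The default of 1.0 leaves scoring unchanged.
     */
    protected float injectedTermBoost(String field, String term) {
        return 1.0f;
    }

    /** Renders a single clause as "field:term^boost". */
    final String clause(String field, String term, boolean injected) {
        float boost = injected ? injectedTermBoost(field, term) : 1.0f;
        return field + ":" + term + "^" + boost;
    }
}

// Hypothetical subclass that controls the deboost per field.
class PerFieldDeboostParser extends SketchParser {
    private final Map<String, Float> boostByField = new HashMap<>();

    PerFieldDeboostParser deboost(String field, float boost) {
        boostByField.put(field, boost);
        return this;
    }

    @Override
    protected float injectedTermBoost(String field, String term) {
        Float configured = boostByField.get(field);
        // Fields without a configured value fall back to the base behaviour.
        return configured != null ? configured : super.injectedTermBoost(field, term);
    }
}
```

With this shape, new PerFieldDeboostParser().deboost("title", 0.5f) would deboost injected terms in "title" only, while every other field keeps the default behaviour.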

At the end of the day, a queryparser could always override getFieldQuery completely and do
this now, but that's not great either because the code is so hairy :(

This kind of feature might be easier to implement with the new queryparser in contrib, but
I'm not sure.

> Use BoostAttribute in TokenFilters to denote Terms that QueryParser should give lower boosts
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-3130
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3130
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Hoss Man
>
> A recent thread asked if there was any way to use QueryTime synonyms such that matches
> on the original term specified by the user would score higher than matches on the synonym.
> It occurred to me later that a float Attribute could be set by the SynonymFilter in such
> situations, and QueryParser could use that float as a boost in the resulting Query. (This
> would be fairly straightforward for the simple "synonyms => BooleanQuery" case, but we'd
> have to decide how to handle the case of synonyms with multiple terms that produce MTPQ, possibly
> just punt for now)
> Likewise, there may be other TokenFilters that "inject" artificial tokens at query time
> where it also might make sense to have a reduced "boost" factor...
> * SynonymFilter
> * CommonGramsFilter
> * WordDelimiterFilter
> * etc...
> In all of these cases, the amount of the "boost" could be configured, and for back compat
> could default to "1.0" (or null to not set a boost at all)
> Furthermore: if we add a new BoostAttrToPayloadAttrFilter that just copied the boost
> attribute into the payload attribute, these same filters (when used at index time) could
> give "penalizing" payloads to terms.
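
The simple "synonyms => BooleanQuery" case from the description can be sketched as follows. This is plain Java with no Lucene dependency: BoostedToken, SynonymBoostSketch, expand and toQuery are hypothetical names, the filter and parser are reduced to two static methods, and the query is rendered as a string. It assumes all tokens sit at the same position, so the multi-term (MTPQ) case is not covered.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical token model: a term plus the boost a filter attached to it.
final class BoostedToken {
    final String term;
    final float boost; // 1.0f means "no change", the back-compat default

    BoostedToken(String term, float boost) {
        this.term = term;
        this.boost = boost;
    }
}

final class SynonymBoostSketch {
    /** Mimics a synonym filter: the original keeps boost 1.0, injected synonyms get synonymBoost. */
    static List<BoostedToken> expand(String original, List<String> synonyms, float synonymBoost) {
        List<BoostedToken> out = new ArrayList<>();
        out.add(new BoostedToken(original, 1.0f));
        for (String synonym : synonyms) {
            out.add(new BoostedToken(synonym, synonymBoost));
        }
        return out;
    }

    /** Mimics the parser side: ORs the same-position tokens together, carrying each boost along. */
    static String toQuery(String field, List<BoostedToken> tokens) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 0; i < tokens.size(); i++) {
            BoostedToken t = tokens.get(i);
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(field).append(':').append(t.term);
            if (t.boost != 1.0f) {
                sb.append('^').append(t.boost); // only emit non-default boosts
            }
        }
        return sb.append(')').toString();
    }
}
```

With a synonym boost of 1.0 the output is the same as an unboosted OR of the terms, which is the back-compat default the description asks for.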

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

