jackrabbit-users mailing list archives

From Lukas Kahwe Smith <...@pooteeweet.org>
Subject Re: Boosting newer documents
Date Tue, 06 Dec 2011 13:12:04 GMT

On Dec 6, 2011, at 13:58 , Jukka Zitting wrote:

> Hi,
> 
> 
> On Tue, Dec 6, 2011 at 10:27 AM, Christian Stocker
> <christian.stocker@liip.ch> wrote:
>> Before we dig more into this: Would this be the correct way? Is this
>> even possible in Jackrabbit without having to change too much? Or is
>> there an easier way to give "newer" Documents more weight than
>> older ones?
> 
> I guess you could achieve the same effect by having a custom indexing
> configuration file [1] that you modify once per day or per week to
> increase the boost for specific node types.
> 
> However, it seems counterintuitive to have to keep increasing the
> boost either with configuration changes or with a boost function like
> the one you proposed.

Yes, this needs to be a boost that is determined at query time, not at index time.
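
To make the query-time idea concrete, here is a minimal sketch in plain Java. It assumes a reciprocal decay of the form a / (m * x + b), modelled on the recip(ms(NOW,date),3.16e-11,1,1) example from the Solr relevancy FAQ (referenced as [2] further down). The class and method names are hypothetical, and nothing here is an existing Jackrabbit or Lucene API; it only illustrates that the factor depends on "now" and so has to be computed when the query runs.

```java
import java.time.Duration;
import java.time.Instant;

public class RecencyBoost {
    // Roughly 1 / (milliseconds in a year), the constant used in the Solr FAQ example.
    static final double M = 3.16e-11;

    // Reciprocal decay a / (m * x + b) with a = b = 1 (an assumption, not a Jackrabbit setting):
    // exactly 1.0 for a document modified "now", about 0.5 for a one-year-old document.
    static double recencyFactor(Instant lastModified, Instant now) {
        long ageMs = Math.max(0L, Duration.between(lastModified, now).toMillis());
        return 1.0 / (M * ageMs + 1.0);
    }

    // The full-text score is scaled by the age factor at query time,
    // so nothing has to be re-indexed as documents get older.
    static double boostedScore(double textScore, Instant lastModified, Instant now) {
        return textScore * recencyFactor(lastModified, now);
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        Instant oneYearAgo = now.minus(Duration.ofDays(365));
        System.out.println("fresh document:    " + boostedScore(2.0, now, now));
        System.out.println("one-year-old doc:  " + boostedScore(2.0, oneYearAgo, now));
    }
}
```

Because the age is measured against the query's own timestamp, the same index entry yields a smaller factor next month without any configuration change.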

> As already suggested by Alex, I'd rather use sorting for this. To
> allow the full text match score to affect the sort order you could use
> just the year, month or week number as the first sort term and let the
> matches within that time period be sorted according to the match
> score.

Well, sorting by date isn't a good solution at all if you also care about the score.
It basically reduces Lucene to an RDBMS LIKE engine.
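
A small sketch of why, assuming the "week number first, score second" ordering suggested above. The Hit class, the week numbers and the scores are made up for illustration; this is plain Java, not Jackrabbit query code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class BucketSort {
    // Hypothetical search hit: a name, the week it was modified in, and its full-text score.
    static final class Hit {
        final String name;
        final int week;
        final double score;

        Hit(String name, int week, double score) {
            this.name = name;
            this.week = week;
            this.score = score;
        }
    }

    // Newest week first; the score only decides the order inside one week.
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingInt((Hit h) -> h.week).reversed()
                .thenComparing(Comparator.comparingDouble((Hit h) -> h.score).reversed()));
        List<String> names = new ArrayList<>();
        for (Hit h : sorted) {
            names.add(h.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit("strongOld", 48, 9.0),
                new Hit("weakNew", 49, 0.1),
                new Hit("midNew", 49, 0.5));
        System.out.println(rank(hits));
    }
}
```

With these made-up values, the barely matching hit from week 49 lands above the strong match from week 48: the score never gets to outweigh the date bucket, it only breaks ties within it.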

> Solr has a more complex mechanism for such date-based scoring (see
> [2]), but making something like that work with Jackrabbit probably
> needs quite a bit of work on the search index layer.
> 
> [1] http://wiki.apache.org/jackrabbit/IndexingConfiguration
> [2] http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents


Yes, maybe it's not feasible "just" for this. But IMHO the biggest gap in Jackrabbit right
now is that it doesn't expose more of Lucene (faceting, for example). It would be a pity to
have to duplicate all the data in a separate Solr/ElasticSearch instance that would then be
totally unaware of ACLs and path relations.

regards,
Lukas Kahwe Smith
mls@pooteeweet.org



