lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Robert Muir (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-8326) More Like This Params Refactor
Date Thu, 24 May 2018 03:00:00 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-8326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16488345#comment-16488345
] 

Robert Muir commented on LUCENE-8326:
-------------------------------------

First looking at the API change, it would be good to understand the goals. 

This change wraps 8 or 9 existing setters such as {{setMinTermLen}} with a "configuration
class". There is also another class related to boosts. But everything is still just as mutable
as before, so from my perspective it only adds additional indirection/abstraction which is
undesired.

If we want to make MLT immutable or something like that, we should first figure out if that's
worth it. From my perspective, I'm not sold on this for MoreLikeThis itself, since its lightweight
and stateless, and since I can't see a way for MoreLikeThisQuery to cache efficiently.

On the other hand MoreLikeThisQuery is kind of a mess, but that isn't addressed with the refactoring.
Really all queries should be immutable for caching purposes, and should all have correct equals/hashcode:
but seems like its a lost cause with MoreLikeThisQuery since it does strange stuff in rewrite:
its not really a per-segment thing. Because of how the query works, its not obvious to me
if/how we can improve it with immutability...

Also currently MoreLikeThisQuery doesn't accept MoreLikeThis as a parameter or anything, and
only uses it internally. So as it stands (also with this patch) it still has a "duplicate"
API of all the parameters, which isn't great.

So I think if we want to change the API for this stuff, we should figure out what the goals
are? If its just to say, consolidate api between MoreLikeThis and MoreLikeThisQuery, I can
buy into that (although I have never used the latter myself, only the former). However the
other queries use builders for such purposes so that's probably something to consider.

For the solr changes, my only comment would be that instead of running actual queries, isn't
it good enough to just test that XYZ configuration produces a correct MLT object? Otherwise
the test seems fragile from my perspective.

> More Like This Params Refactor
> ------------------------------
>
>                 Key: LUCENE-8326
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8326
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alessandro Benedetti
>            Priority: Major
>         Attachments: LUCENE-8326.patch, LUCENE-8326.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> More Like This ca be refactored to improve the code readability, test coverage and maintenance.
> Scope of this Jira issue is to start the More Like This refactor from the More Like This
Params.
> This Jira will not improve the current More Like This but just keep the same functionality
with a refactored code.
> Other Jira issues will follow improving the overall code readability, test coverage and
maintenance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Mime
View raw message