lucene-dev mailing list archives

From Shai Erera <>
Subject ReadTask and its hierarchy needs some house cleaning
Date Tue, 18 May 2010 07:06:17 GMT

I wanted to run a benchmark .alg which will take a Filter into account.
However, ReadTask, which is the base for a variety of search-related tasks,
does not support a Filter. When I reviewed the class, to understand how I
can easily add such Filter support, I discovered a whole set of classes
which IMO are completely unnecessary. ReadTask defines some with*() methods,
such as withSearch, withTraverse etc., and many classes extend ReadTask
just to return true/false in those methods. WarmTask, for example, returns
true in withWarm() and false in the others, while SearchTask returns true in
withSearch and false in the others.

This created a whole set of extensions that you either need to run in
sequence (e.g. Warm, SearchWithCollector) or create your own extension just
to get the right recipe for the operations to perform.

I suggest we make the following changes:
* Rename ReadTask to SearchTask -- that's because RT uses IndexSearcher,
QueryMaker -- all that suggests it's about Searching and not Reading. It's
only semantics, I know, but I think SearchTask is clearer than ReadTask
* Get rid of all the with*() methods, and instead move to use properties:
search.with.warm, search.with.traverse, search.with.collector etc.
* Introduce protected createCollector, createFilter, createSort, for custom
extensions to override
* Create a completely new hierarchy for this task, throwing away everything
that can be handled through properties only (like SearchTask, WarmTask etc.)

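A minimal sketch of how the proposed task could look, assuming the property
names suggested above. It is self-contained and does not use the real Lucene
or benchmark classes: SearchTaskSketch, its String placeholders for
Collector/Filter/Sort, and the recorded step names are all hypothetical,
chosen only to illustrate the shape of the design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical stand-in for the proposed SearchTask. Strings are used as
// placeholders where the real task would use Collector, Filter and Sort.
class SearchTaskSketch {
    // The do<OP> members are protected so that an extension can hard-code a recipe.
    protected boolean doWarm;
    protected boolean doTraverse;
    protected boolean doCollector;

    SearchTaskSketch(Properties props) {
        // Property names follow the proposal; all default to false here (an assumption).
        doWarm = Boolean.parseBoolean(props.getProperty("search.with.warm", "false"));
        doTraverse = Boolean.parseBoolean(props.getProperty("search.with.traverse", "false"));
        doCollector = Boolean.parseBoolean(props.getProperty("search.with.collector", "false"));
    }

    // Extension points: returning null means "none" for filter and sort.
    protected String createCollector() { return "default"; }
    protected String createFilter() { return null; }
    protected String createSort() { return null; }

    // Optional operations, factored out so extensions can customize them.
    protected void warm(List<String> steps) { steps.add("warm"); }
    protected void traverse(List<String> steps) { steps.add("traverse"); }

    // Runs the configured recipe and returns the steps that were executed.
    List<String> run() {
        List<String> steps = new ArrayList<>();
        if (doWarm) {
            warm(steps);
        }
        // Search is the core operation; everything else is configuration.
        StringBuilder search = new StringBuilder("search");
        String filter = createFilter();
        if (filter != null) {
            search.append("+filter:").append(filter);
        }
        String sort = createSort();
        if (sort != null) {
            search.append("+sort:").append(sort);
        }
        if (doCollector) {
            search.append("+collector:").append(createCollector());
        }
        steps.add(search.toString());
        if (doTraverse) {
            traverse(steps);
        }
        return steps;
    }
}
```

With this shape, warm + traverse needs only two properties set to true, while
Filter support needs only an override of createFilter().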
If we do this, then extensions of the new SearchTask will need to ask
themselves "do I want to search w/ a Collector/Filter/custom Sort?" and not
"do I want Warm to be executed?" The core operation behind this task is
search. The rest are just settings, or configuration, as well as some added
ops like warm and traverse. If it makes sense, I can factor warm() and
traverse() into their own protected methods, for extensions to override as
well. It might make sense for warm because custom warming is something I'm
sure will be needed.

This will also allow running algorithms with rounds - different properties
for different rounds.
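As a rough illustration of per-round settings, the sketch below resolves a
boolean property that lists one value per round. The colon-separated format
used here ("true:false") and the RoundPropsSketch helper are simplifying
assumptions for illustration; they are not the benchmark package's actual
configuration code.

```java
import java.util.Properties;

// Hypothetical helper: picks the value of a property for a given round.
class RoundPropsSketch {
    private final Properties props;

    RoundPropsSketch(Properties props) {
        this.props = props;
    }

    // Returns the value for the given round, cycling through the listed
    // values when there are more rounds than values.
    boolean get(String name, int round, boolean dflt) {
        String raw = props.getProperty(name);
        if (raw == null) {
            return dflt;
        }
        String[] perRound = raw.split(":");
        if (perRound.length == 0) {
            return dflt; // e.g. the value was just ":"
        }
        return Boolean.parseBoolean(perRound[round % perRound.length].trim());
    }
}
```

For example, search.with.warm=true:false would warm in round 0, skip warming
in round 1, and warm again in round 2.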

This approach does not prevent one from creating MySearchTask with
pre-defined and hard-coded settings. But for many others, the question of
which task to execute will go away - you execute SearchTask for the basic
search operations, or w/ the default Collector/Sort, and you control it via
properties. To create your own *SearchTask extension which hard-codes a
recipe, you'll need access to all the do<OP> members, so I'll make them
protected. But that IMO is a rarer requirement than, say, running a search
with warm + traverse, and you shouldn't be forced to create a ReadTask
extension for that.

What do you think?

