Return-Path: Delivered-To: apmail-lucene-dev-archive@www.apache.org Received: (qmail 53333 invoked from network); 18 May 2010 07:06:51 -0000 Received: from unknown (HELO mail.apache.org) (140.211.11.3) by 140.211.11.9 with SMTP; 18 May 2010 07:06:51 -0000 Received: (qmail 8388 invoked by uid 500); 18 May 2010 07:06:50 -0000 Delivered-To: apmail-lucene-dev-archive@lucene.apache.org Received: (qmail 8191 invoked by uid 500); 18 May 2010 07:06:47 -0000 Mailing-List: contact dev-help@lucene.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@lucene.apache.org Delivered-To: mailing list dev@lucene.apache.org Received: (qmail 8184 invoked by uid 99); 18 May 2010 07:06:46 -0000 Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 18 May 2010 07:06:46 +0000 X-ASF-Spam-Status: No, hits=2.2 required=10.0 tests=FREEMAIL_FROM,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_PASS,T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: apache.org Received-SPF: pass (nike.apache.org: domain of serera@gmail.com designates 74.125.83.48 as permitted sender) Received: from [74.125.83.48] (HELO mail-gw0-f48.google.com) (74.125.83.48) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 18 May 2010 07:06:38 +0000 Received: by gwj16 with SMTP id 16so446225gwj.35 for ; Tue, 18 May 2010 00:06:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=8bARgxNxzeV+AA6+2uYPyFK0Swr0d5jNeVgWNc4avXw=; b=aWgh9PHrzsSR2G61AIjADf+wKj5KNWOvZvbLtjuzl8yEepQsVkVsNCiJe48ORSknqZ 0oU9ni+49lrJSpPMG0y6mCwHcCYPJ8gCJ61Zz9Javs6C/a4AazzDmIgpyJYFYKcCTPgh yHs5+YFke/N4I5D90kLJlGdMYv0S6IspIdwH4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=tyLFKMYz/6zd82RNsf1YwAa6/Wy3t4SfRkRlhdoeLvjEwp1Jt/kLW1D1+cwhozHWe4 TQDNpLtY7KdIKf/h5uSzh1m93bN7T2C1PO19UM+lHtO5BLI5fPPsvEBMglVeVud7DJs7 7WD0uMhIocX99PEEUDR8xVPjwKsmUfOsv0yIU= MIME-Version: 1.0 Received: by 10.101.10.36 with SMTP id n36mr7355649ani.114.1274166377246; Tue, 18 May 2010 00:06:17 -0700 (PDT) Received: by 10.100.255.11 with HTTP; Tue, 18 May 2010 00:06:17 -0700 (PDT) Date: Tue, 18 May 2010 10:06:17 +0300 Message-ID: Subject: ReadTask and its hierarchy needs some house cleaning From: Shai Erera To: dev@lucene.apache.org Content-Type: multipart/alternative; boundary=0016e68fd0644cf09f0486d8fa4f X-Virus-Checked: Checked by ClamAV on apache.org --0016e68fd0644cf09f0486d8fa4f Content-Type: text/plain; charset=ISO-8859-1 Hi I wanted to run a benchmark .alg which will take a Filter into account. However, ReadTask, which is the base for a variety of search related tasks, does not support a Filter. When I reviewed the class, to understand how I can easily add such Filter support, I discovered a whole set of classes which IMO are completely unnecessary. ReadTask defines some with*() methods, such as withSearch, withTraverse etc. and many classes override ReadTask just to return true/false in those methods. WarmTask for example, returns true in withWarm() and false otherwise, while SearchTask returns true in withSearch and false otherwise. This created a whole set of extensions that you either need to run in sequence (e.g. Warm, SearchWithCollector) or create your own extension just to get the right recipe for the operations to perform. I suggest we do the following changes: * Rename ReadTask to SearchTask -- that's because RT uses IndexSearcher, QueryMaker -- all that suggests it's about Searching and not Reading. It's only semantics, I know, but I think SearchTask is clearer than ReadTask * Get rid of all the with*() methods, and instead move to use properties: search.with.warm, search.with.traverse, search.with.collector etc. * Introduce protected createCollector, createFilter, createSort, for custom extensions * Create a completely new hierarchy for this task, throwing away everything that can be handled through properties only (like SearchTask, WarmTask etc.) If we do this, then extensions of the new SearchTask will need to ask themselves "do I want to search w/ a Collector/Filter/custom Sort?" and not "do I Warm to be executed?" The core operation behind this task is IndexSearcher.search. The rest are just settings, or configuration, as well as some added ops like warm, and traverse. If it makes sense, I can factor warm() and traverse() into their own protected methods, for extensions to override as well. It might make sense for warm because custom warms is something I'm sure will be needed. This will also allow running algorithms with rounds - different properties for different rounds. This approach does not prevent one from creating MySearchTask with pre-defined and hard-coded settings. But for many others, the question of which task to execute will go away - you execute SearchTask for the basic search operations, or w/ the default Collector/Sort, and you control it via properties. To create your own *SearchTask extension which hard-codes a recipe, you'll need access to all the do members, so I'll make them protected. But that's IMO is a rare requirement, than say running a search with warm + traverse, and you shouldn't be forced to create a ReadTask extension for that. What do you think? Shai --0016e68fd0644cf09f0486d8fa4f Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable
Hi

I wanted to run a benchmark .alg which will take= a Filter into account. However, ReadTask, which is the base for a variety = of search related tasks, does not support a Filter. When I reviewed the cla= ss, to understand how I can easily add such Filter support, I discovered a = whole set of classes which IMO are completely unnecessary. ReadTask defines= some with*() methods, such as withSearch, withTraverse etc. and many class= es override ReadTask just to return true/false in those methods. WarmTask f= or example, returns true in withWarm() and false otherwise, while SearchTas= k returns true in withSearch and false otherwise.

This created a whole set of extensions that you either need to run in s= equence (e.g. Warm, SearchWithCollector) or create your own extension just = to get the right recipe for the operations to perform.

I suggest we = do the following changes:
* Rename ReadTask to SearchTask -- that's because RT uses IndexSearcher= , QueryMaker -- all that suggests it's about Searching and not Reading.= It's only semantics, I know, but I think SearchTask is clearer than Re= adTask
* Get rid of all the with*() methods, and instead move to use properties: s= earch.with.warm, search.with.traverse, search.with.collector etc.
* Intr= oduce protected createCollector, createFilter, createSort, for custom exten= sions
* Create a completely new hierarchy for this task, throwing away everything= that can be handled through properties only (like SearchTask, WarmTask etc= .)

If we do this, then extensions of the new SearchTask will need to= ask themselves "do I want to search w/ a Collector/Filter/custom Sort= ?" and not "do I Warm to be executed?" The core operation be= hind this task is IndexSearcher.search. The rest are just settings, or conf= iguration, as well as some added ops like warm, and traverse. If it makes s= ense, I can factor warm() and traverse() into their own protected methods, = for extensions to override as well. It might make sense for warm because cu= stom warms is something I'm sure will be needed.

This will also allow running algorithms with rounds - different propert= ies for different rounds.

This approach does not prevent one from cr= eating MySearchTask with pre-defined and hard-coded settings. But for many = others, the question of which task to execute will go away - you execute Se= archTask for the basic search operations, or w/ the default Collector/Sort,= and you control it via properties. To create your own *SearchTask extensio= n which hard-codes a recipe, you'll need access to all the do<OP>= members, so I'll make them protected. But that's IMO is a rare req= uirement, than say running a search with warm + traverse, and you shouldn&#= 39;t be forced to create a ReadTask extension for that.

What do you think?

Shai
--0016e68fd0644cf09f0486d8fa4f--