lucene-dev mailing list archives

From "none none" <kor...@lycos.com>
Subject Re: Query Term Collector (was: Re: New highlighter package available)
Date Sun, 05 Oct 2003 16:05:09 GMT
OK Mark,
I will run a couple of tests comparing my way of collecting terms with yours. If I don't see a
reasonable improvement from mine, I will use yours; otherwise I'll keep changing the code, as I
have been doing for more than a year. My search engine needs very high performance. I also don't
reuse Query objects, and my index is updated very often, so in my case, if I can speed things up
or save resources, I am willing to pay the price of breaking these requirements, simply because
I don't need them.
Thank you,
Ciao Korfut.
--

--------- Original Message ---------

DATE: Sun, 5 Oct 2003 09:15:14 
From: markharw00d@yahoo.co.uk
To: lucene-dev@jakarta.apache.org
Cc: 

>Here are some very important reasons why getTerms() shouldn't be added as a method to
>Query:
>
>Query objects are seen by Lucene users as reusable objects.
>
>E.g. they could be used as routing queries which are run repeatedly to classify incoming
>documents.
>
>They are re-usable across multiple indexes and index versions, i.e. they hold no state about
>specific indexes. That's the current contract.
>
>If you decided to slap a method called getTerms() on a query which returns expansions of multi-terms
>that is adding state which effectively ties the Query instance to a particular index and a particular
>snapshot of that index's content, rendering the query unreusable.
>
>It is useful to think of Queries in two forms:
>
>1) High-level, reusable, index- and index-version-independent objects (returned by QueryParser)
>2) Targeted queries associated with a particular version of an index, used briefly then discarded.
>
>Now. Type 2 ("targeted") is the query returned by query.rewrite(reader) and was until recently used
>exclusively by the search process and subsequently thrown away.
>
>The new highlighting code also requires the use of "targeted queries", but it is not possible to get
>hold of the targeted query that is the by-product of the search. This is why the caller is expected
>to create a "targeted" query by calling rewrite, THEN calling the search and highlight functions with
>this version.
>
>
>These query types are important distinctions to preserve and the getTerms() proposal 
>doesn't respect these subtle differences in query usage.
>
>
>Cheers
>Mark
>
>
>
>PS
>>>I looked at your code quickly, can you confirm that the following scenario is what
>>>happens when you run a search with MultiTermQuery?
>
>Not true any more. I think you're looking at outdated code.
>See my recent post which described how I ripped out the rewrite calls in the latest highlighter and made
>it the caller's responsibility:
>http://marc.theaimsgroup.com/?l=lucene-dev&m=106507977317157&w=2
>As for "prohibited" - note the highlighter takes a "prohibited" parameter too.
>
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
>For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
>
>
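
The two query forms described in the quoted message can be illustrated with a small toy model.
This is only a sketch under stated assumptions: IndexSnapshot, SimpleTerm, SimplePrefix, SimpleOr
and collectTerms are hypothetical names invented for illustration and are not Lucene's classes or
the highlighter's API. The point it tries to show is that the reusable query holds no index state,
that rewrite() returns a new "targeted" query for one snapshot, and that terms are collected from
the targeted query by an external helper rather than by a getTerms() method on the query itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class QueryFormsSketch {

    /** Hypothetical stand-in for one snapshot of an index: just its sorted set of terms. */
    static final class IndexSnapshot {
        private final TreeSet<String> terms;
        IndexSnapshot(String... terms) { this.terms = new TreeSet<>(Arrays.asList(terms)); }
        Set<String> terms() { return terms; }
    }

    /** Form 1: a reusable query that holds no state about any particular index. */
    interface Query {
        /** Returns a query targeted at the given snapshot; never mutates this object. */
        Query rewrite(IndexSnapshot snapshot);
    }

    /** A single concrete term; already "targeted", so rewrite returns itself. */
    static final class SimpleTerm implements Query {
        final String term;
        SimpleTerm(String term) { this.term = term; }
        public Query rewrite(IndexSnapshot snapshot) { return this; }
    }

    /** A disjunction of clauses; rewriting rewrites each clause into a new disjunction. */
    static final class SimpleOr implements Query {
        final List<Query> clauses;
        SimpleOr(List<Query> clauses) { this.clauses = clauses; }
        public Query rewrite(IndexSnapshot snapshot) {
            List<Query> rewritten = new ArrayList<>();
            for (Query clause : clauses) rewritten.add(clause.rewrite(snapshot));
            return new SimpleOr(rewritten);
        }
    }

    /** A multi-term query: its expansion depends on the snapshot it is rewritten against. */
    static final class SimplePrefix implements Query {
        final String prefix;
        SimplePrefix(String prefix) { this.prefix = prefix; }
        public Query rewrite(IndexSnapshot snapshot) {
            List<Query> expansions = new ArrayList<>();
            for (String term : snapshot.terms()) {
                if (term.startsWith(prefix)) expansions.add(new SimpleTerm(term));
            }
            return new SimpleOr(expansions); // Form 2: tied to this snapshot, use then discard
        }
    }

    /** External term collector: works on targeted queries only, adds no state to Query. */
    static Set<String> collectTerms(Query targeted) {
        Set<String> out = new TreeSet<>();
        collect(targeted, out);
        return out;
    }

    private static void collect(Query query, Set<String> out) {
        if (query instanceof SimpleTerm) {
            out.add(((SimpleTerm) query).term);
        } else if (query instanceof SimpleOr) {
            for (Query clause : ((SimpleOr) query).clauses) collect(clause, out);
        } else {
            throw new IllegalArgumentException("rewrite the query against a snapshot first");
        }
    }

    public static void main(String[] args) {
        Query reusable = new SimplePrefix("luc"); // built once, reused across snapshots
        IndexSnapshot older = new IndexSnapshot("lucene", "lucid", "query");
        IndexSnapshot newer = new IndexSnapshot("lucene", "lucky", "query");
        System.out.println(collectTerms(reusable.rewrite(older))); // [lucene, lucid]
        System.out.println(collectTerms(reusable.rewrite(newer))); // [lucene, lucky]
    }
}
```

In this toy model the same reusable query yields different term sets for different snapshots,
which is why storing the expansion on the query object itself would tie it to one snapshot.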





