Return-Path:
Delivered-To: apmail-jakarta-lucene-dev-archive@apache.org
Received: (qmail 75362 invoked from network); 18 Mar 2003 22:54:48 -0000
Received: from exchange.sun.com (192.18.33.10) by daedalus.apache.org with SMTP; 18 Mar 2003 22:54:48 -0000
Received: (qmail 14990 invoked by uid 97); 18 Mar 2003 22:56:37 -0000
Delivered-To: qmlist-jakarta-archive-lucene-dev@nagoya.betaversion.org
Received: (qmail 14983 invoked from network); 18 Mar 2003 22:56:37 -0000
Received: from daedalus.apache.org (HELO apache.org) (208.185.179.12) by nagoya.betaversion.org with SMTP; 18 Mar 2003 22:56:37 -0000
Received: (qmail 75095 invoked by uid 500); 18 Mar 2003 22:54:46 -0000
Mailing-List: contact lucene-dev-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
List-Unsubscribe:
List-Subscribe:
List-Help:
List-Post:
List-Id: "Lucene Developers List"
Reply-To: "Lucene Developers List"
Delivered-To: mailing list lucene-dev@jakarta.apache.org
Received: (qmail 75084 invoked from network); 18 Mar 2003 22:54:45 -0000
Received: from www2.mail.lycos.com (HELO mailcity.com) (209.202.220.150) by daedalus.apache.org with SMTP; 18 Mar 2003 22:54:45 -0000
Received: from Unknown/Local ([?.?.?.?]) by mailcity.com; Tue, 18 Mar 2003 22:54:41 -0000
To: lucene-dev@jakarta.apache.org
Date: Tue, 18 Mar 2003 14:54:41 -0800
From: "none none"
Message-ID:
Mime-Version: 1.0
X-Sent-Mail: off
Reply-To: korfut@lycos.com
X-Mailer: MailCity Service
X-Priority: 3
Subject: Re: Iterators for collecting Terms from Queries
X-Sender-Ip: 64.187.36.2
Organization: Lycos Mail (http://www.mail.lycos.com:80)
Content-Type: text/plain; charset=us-ascii
Content-Language: en
Content-Transfer-Encoding: 7bit
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N

>(this is just a minor implementation suggestion)
>I think perhaps this flag could be passed to Query when executing query, not
>stored in Query object?
>This is because it's not really a property of Query
>object but property of execution of search (whether to keep track of Terms so
>they can be requested from Query, or returned along with Search results).
>This would require changes to Query classes however.
>

Smart! I hadn't really thought about where to put it, but I thought it would be good to avoid that, because many users do not need the highlighting (aka termCollector), so why force them? In your solution, which as I said is more elegant, the user has to decide to do so; in my case that means setting the variable to true. Good idea, Tatu!!

>One problem I tried to solve was that user shouldn't have to know structure of
>Query classes (that's what visitor pattern in general solves), while still
>allowing access to some useful properties, such as optional/reqd/prohibited
>flag that's only available in BooleanClause, not in queries (iterator keeps
>track of those flags and allows them to be accessed as if they were
>properties of queries themselves).
>
>Note however that your method could be changed to do similar recursive
>traversal (if it doesn't already do that, I may have misunderstood your
>explanation?) for simple cases, so that caller wouldn't have to know the
>structure, if it only needs terms, not context (ie. need not know which Term
>came from which query; sometimes this is needed, esp. with phrase queries).
>

Yes, I didn't explain it, but what I actually do in my HighLighter class is something like your TermCollector: I put all the terms together. Please note that I add extra information when I collect them. For example, I store the "slop", because my highlight implementation needs to know its value. Let's say I do something more than just collecting in this class.

>Like I said above, while you are right that it does have overhead (computing
>terms twice), I'm not sure how significant that would be in general, compared
>to search, scoring etc.
>It would be good to do some simple tests to see if I'm wrong here and Term
>collection is actually big part of execution time.
>

I believe (and, as you said, we could run a test) that for a Wildcard or Prefix query this will make a noticeable difference.

>One other thing I was thinking about was refactoring Range and Prefix queries
>to be MultiTermQuery - based. I think that should benefit both solutions.

I totally agree with you. I also believe everything could be expressed as a BooleanQuery or a MultiTermQuery; a TermQuery would be a MultiTermQuery with one term in the array, for instance.

>Plus, it seems to me that PhrasePrefixQuery perhaps should just be rewritten.
>It acts very different from other queries, requiring caller to expand terms
>when it's being built. It seems like it perhaps should work more like plain
>PrefixQuery, and do expansion only when being executed. Otherwise one
>has to build new Query for each search execution, if index has changed.

I don't really use PhrasePrefixQuery, partly because it is not supported by the QueryParser (you have to create it yourself), so for now I just avoid it.

Thank you,
Ciao.
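
A minimal sketch of the two ideas discussed in this thread, assuming nothing about the real Lucene classes: the "collect terms" flag is passed when the search is executed rather than stored in the query, and collection recurses through nested queries so the caller does not need to know the query structure. Every name here (SketchQuery, SketchTermQuery, SketchBooleanQuery, search) is hypothetical and exists only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Lucene classes.
class TermCollectionSketch {

    /** A query node that can report its terms to a caller-supplied list. */
    interface SketchQuery {
        void collectTerms(List<String> out);
    }

    /** Leaf query holding a single term. */
    static final class SketchTermQuery implements SketchQuery {
        private final String term;
        SketchTermQuery(String term) { this.term = term; }
        public void collectTerms(List<String> out) { out.add(term); }
    }

    /** Composite query; collection recurses so the caller never sees the structure. */
    static final class SketchBooleanQuery implements SketchQuery {
        private final List<SketchQuery> clauses = new ArrayList<>();
        SketchBooleanQuery add(SketchQuery q) { clauses.add(q); return this; }
        public void collectTerms(List<String> out) {
            for (SketchQuery clause : clauses) {
                clause.collectTerms(out);
            }
        }
    }

    /**
     * Stand-in for executing a search. The collectTerms flag belongs to this call,
     * not to the query object, so callers that do not highlight skip collection.
     * Returns the collected terms, or an empty list when the flag is false.
     */
    static List<String> search(SketchQuery query, boolean collectTerms) {
        List<String> terms = new ArrayList<>();
        // ... the actual index search would happen here (omitted in this sketch) ...
        if (collectTerms) {
            query.collectTerms(terms);
        }
        return terms;
    }

    public static void main(String[] args) {
        SketchQuery q = new SketchBooleanQuery()
                .add(new SketchTermQuery("lucene"))
                .add(new SketchBooleanQuery().add(new SketchTermQuery("highlight")));
        System.out.println(search(q, true));   // terms gathered for highlighting
        System.out.println(search(q, false));  // no collection requested
    }
}
```

A highlighter that also needs context (for example the slop of a phrase query, or which clause a term came from) would have to collect richer objects than plain strings; this sketch covers only the simple "terms only" case.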