From: mark harwood
Date: Thu, 6 Apr 2006 10:52:13 +0100 (BST)
Subject: Query.extractTerms - a poor introspection API?
To: java-dev@lucene.apache.org

Having switched the highlighter over from lots of Query-specific code to the generic Query.extractTerms API, I realize I have both gained something (support for all query types) and lost something (detailed boost info for each term in the tree, e.g. fuzzy spelling variants). The boost info was useful for selecting snippets and grading highlight intensity.

This exercise has led me to the conclusion that extractTerms is not the greatest way to provide information about queries. I see a clear analogy with the way exceptions were implemented in Java: there used to be no standard way of unravelling nested exceptions, and this was solved in JDK 1.4 by adding a "getCause()" method to exceptions to allow progressive unravelling of all exception types. Unfortunately, Query.extractTerms(Set) is a bit like solving the Java nested exceptions problem by providing a method like Throwable.getMessageStrings(Set): it gives only part of the information about the tree elements (i.e. no boost info) and provides no indication of the nested structure.

Maybe we should have as a standard part of Query:

    //immediate child queries only
    Query[] getNestedQueries();

and...

    //immediate terms only
    Term[] getTerms();

A generic highlighter implementation could then:
a) work with any query type
b) more accurately assess the score contribution each term provides, based on its position in the stack and the boosts applied to each parent query on that branch

This doesn't seem a particularly onerous API to implement, and a more feature-rich Query introspection API may well enable other applications such as Query optimizers.
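As a rough sketch of how a caller such as a highlighter might walk the two proposed methods and fold parent boosts into each term: everything named here is a hypothetical stand-in rather than a Lucene class. IntrospectableQuery, Node and the "field:text" strings used in place of Term are invented for illustration, and the sketch assumes every node exposes getBoost() and that boosts combine by multiplication down a branch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryIntrospectionSketch {

    /** Hypothetical stand-in for the proposed additions to Query; not the Lucene API. */
    interface IntrospectableQuery {
        IntrospectableQuery[] getNestedQueries(); // immediate child queries only
        String[] getTerms();                      // immediate terms only, as "field:text"
        float getBoost();                         // assumed to be available on every node
    }

    /** Minimal tree node used only for this illustration. */
    static final class Node implements IntrospectableQuery {
        private final float boost;
        private final String[] terms;
        private final IntrospectableQuery[] children;

        Node(float boost, String[] terms, IntrospectableQuery... children) {
            this.boost = boost;
            this.terms = terms;
            this.children = children;
        }

        public IntrospectableQuery[] getNestedQueries() { return children; }
        public String[] getTerms() { return terms; }
        public float getBoost() { return boost; }
    }

    /** Maps each term to the product of the boosts on the path from the root to it. */
    public static Map<String, Float> effectiveBoosts(IntrospectableQuery root) {
        Map<String, Float> out = new LinkedHashMap<>();
        collect(root, 1.0f, out);
        return out;
    }

    private static void collect(IntrospectableQuery q, float inherited, Map<String, Float> out) {
        float boost = inherited * q.getBoost();
        for (String term : q.getTerms()) {
            // If a term occurs on several branches, keep the largest boost (an arbitrary choice).
            out.merge(term, boost, Math::max);
        }
        for (IntrospectableQuery child : q.getNestedQueries()) {
            collect(child, boost, out);
        }
    }

    /** Builds a small tree: a boosted group of spelling variants plus a boosted title term. */
    public static Map<String, Float> demo() {
        IntrospectableQuery exact = new Node(1.0f, new String[] {"body:lucene"});
        IntrospectableQuery variant = new Node(0.5f, new String[] {"body:lucent"});
        IntrospectableQuery fuzzy = new Node(2.0f, new String[0], exact, variant);
        IntrospectableQuery title = new Node(3.0f, new String[] {"title:search"});
        IntrospectableQuery root = new Node(1.0f, new String[0], fuzzy, title);
        return effectiveBoosts(root);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The walker needs no knowledge of concrete query types, which is the point of a) above, and the per-term figure reflects every boost on the branch, which is the point of b). Taking the maximum when a term appears on more than one branch is only one possible policy; a real highlighter might sum or otherwise combine them.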
Cheers,
Mark