Date: Wed, 21 Dec 2005 16:20:19 -0800 (PST)
From: Chris Hostetter
To: java-dev@lucene.apache.org
Subject: Re: "Advanced" query language
In-Reply-To: <20051216075159.30455.qmail@web26003.mail.ukl.yahoo.com>

I finally got a chance to look at
this code today (the best part about the last day before vacation is that no one expects you to get anything done, so you can ignore your "real work" and spend time on things that are more important in the long run) and while I still haven't wrapped my head around all of it, I wanted to share my thoughts so far on the API...

1) I applaud the pluggable nature of your solution. Looking at the Test Case, it is easy to see exactly how a service provider could do things like override the behavior of a to be implemented as a SpanQuery without their clients being affected at all. Kudos.

2) Digging into what was involved in writing an ObjectBuilder, I found the API somewhat confusing. I was reminded of this exchange you had with Yonik...

: > While SAX is fast, I've found callback interfaces more difficult to
: > deal with while generating nested object graphs... it normally
: > requires one to maintain state in stack(s).
:
: I've gone to some trouble to avoid the effects of this
: on the programming model.

As someone who feels very comfortable with Lucene, but has no practical experience with SAX, I have to say that I don't really feel like the API has a very clean separation from SAX. I think that the ideal API wouldn't require people writing ObjectBuilders to know anything about SAX, or to ever need to import anything from org.xml.** or javax.xml.**

3) While the *need* to maintain/pass state information should be avoided, I can definitely think of uses for this framework that may *want* to pass state information -- both down to the ObjectBuilders that get used in inner nodes, as well as up to wrapping nodes, and there doesn't seem to be an easy way to do that. (It could just be my lack of SAX knowledge though.) The best example I can give is if someone (i.e., me) wanted to use this framework to allow boolean queries to be written like this...

    "a phrase" fuzzy~ ... How Now Brown Cow? ...

I haven't had a chance to try implementing this, but at a high level, it seems like all of this should be possible and still easy to use. Here's a real rough cut at what I've had floating around in the back of my head (I'm doing this straight into email, pardon any typos or pseudo code) ...

    import java.io.InputStream;
    import java.util.Iterator;
    import java.util.Map;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    /** could be implemented with SAX, or DOM, or Pull */
    public interface LuceneXmlParser {

      /** this method will call setParser(this) on each handler */
      public void registerHandler(String tag, LuceneXmlHandler h);

      /** primary method for clients, parses the xml and calls
          processNode on the root node */
      public Query parse(InputStream xml);

      /** dispatches to the appropriate handler's process method based
          on the Node name, may be called by handlers for recursion of
          children nodes */
      public Query processNode(LuceneXmlNode n, State s);
    }

    public interface LuceneXmlHandler {

      public void setParser(LuceneXmlParser p);

      /** should return a Query that corresponds to the specified node.
          may read/modify state in any way it wants ... it is
          recommended that all implementing methods wrap their state
          before passing it on when processing children. */
      public Query process(LuceneXmlNode n, State s);
    }

    /** A State is a stack frame that can delegate read operations to
        another State it wraps (if there is one), but it cannot delegate
        modifying operations. Classes implementing State should provide
        a constructor that takes another State to wrap. */
    public interface State extends Map<String,Object> {

      /** for callers that want to know what's in the immediate stack
          frame without any delegation */
      public Map<String,Object> getOuterFrame();

      /** should return a new state that wraps the current state */
      public State wrapCurrentState();
    }

    /** a very simple api around the most basic xml concepts */
    public interface LuceneXmlNode {
      public CharSequence getNodeName();
      public Map<String,String> getAttributes();
      public CharSequence getBodyText();
      public Iterator<LuceneXmlNode> getChildren();
    }

    /** an example handler for TermQuery */
    public class TermQueryHandler implements LuceneXmlHandler {
      LuceneXmlParser p;
      public void setParser(LuceneXmlParser q) { p = q; }
      public Query process(LuceneXmlNode n, State s) {
        Map<String,String> attrs = n.getAttributes();
        return new TermQuery(new Term(attrs.get("field"), attrs.get("value")));
      }
    }

    /** an example handler for BooleanQuery */
    public class BooleanQueryHandler implements LuceneXmlHandler {
      LuceneXmlParser p;
      public void setParser(LuceneXmlParser q) { p = q; }
      public Query process(LuceneXmlNode n, State s) {
        BooleanQuery r = new BooleanQuery();
        int minShouldMatch =
          Integer.parseInt(n.getAttributes().get("minShouldMatch"));
        r.setMinimumNumberShouldMatch(minShouldMatch);
        for (Iterator<LuceneXmlNode> kids = n.getChildren(); kids.hasNext(); ) {
          LuceneXmlNode kid = kids.next();
          /* each child gets its own frame, so anything it writes
             can be read back here without touching our own state */
          State kidState = s.wrapCurrentState();
          Query b = p.processNode(kid, kidState);
          BooleanClause.Occur o = BooleanClause.Occur.SHOULD;
          if (kidState.getOuterFrame().containsKey("occurs")) {
            o = (BooleanClause.Occur) kidState.getOuterFrame().get("occurs");
          }
          r.add(b, o);
        }
        return r;
      }
    }

    /** an example handler that can wrap any other handler and give it
        BooleanClause.Occur awareness */
    public class BooleanClauseWrapperHandler implements LuceneXmlHandler {
      LuceneXmlParser p;
      LuceneXmlHandler inner;
      public BooleanClauseWrapperHandler(LuceneXmlHandler i) { inner = i; }
      public void setParser(LuceneXmlParser q) { p = q; }
      public Query process(LuceneXmlNode n, State s) {
        Query q = inner.process(n, s);
        if (n.getAttributes().containsKey("occurs")) {
          /* glossing over string parsing to object construction here */
          s.put("occurs", n.getAttributes().get("occurs"));
        }
        return q;
      }
    }

...does that make sense?

-Hoss
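
PS: to make the State idea a bit more concrete, here is a minimal sketch of a frame that delegates reads to the state it wraps but keeps all writes local. Treat it as an illustration under assumptions rather than a proposed implementation: the class name FrameState and the "defaultField" key are made up, and it exposes only a handful of Map-style methods instead of implementing all of java.util.Map the way the State interface above would require.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stack-frame state: reads fall through to the wrapped
    state, writes stay in this frame. Not a full java.util.Map. */
class FrameState {
    private final Map<String, Object> frame = new HashMap<String, Object>();
    private final FrameState wrapped; // null for the outermost state

    FrameState(FrameState wrapped) {
        this.wrapped = wrapped;
    }

    /** Looks in this frame first, then delegates to the wrapped state. */
    Object get(String key) {
        if (frame.containsKey(key)) {
            return frame.get(key);
        }
        return wrapped == null ? null : wrapped.get(key);
    }

    /** True if the key is visible from here, locally or via delegation. */
    boolean containsKey(String key) {
        return frame.containsKey(key)
            || (wrapped != null && wrapped.containsKey(key));
    }

    /** Modifies only this frame; the wrapped state is never touched. */
    Object put(String key, Object value) {
        return frame.put(key, value);
    }

    /** Read-only view of the immediate frame, without any delegation. */
    Map<String, Object> getOuterFrame() {
        return Collections.unmodifiableMap(frame);
    }

    /** Returns a new frame whose reads fall back to this one. */
    FrameState wrapCurrentState() {
        return new FrameState(this);
    }

    public static void main(String[] args) {
        FrameState root = new FrameState(null);
        root.put("defaultField", "body");

        FrameState kid = root.wrapCurrentState();
        kid.put("occurs", "MUST");

        // a read on the child delegates down to the root frame
        System.out.println(kid.get("defaultField"));
        // the immediate frame only holds what the child wrote itself
        System.out.println(kid.getOuterFrame().containsKey("defaultField"));
        // the child's write is not visible from the root
        System.out.println(root.containsKey("occurs"));
    }
}
```

With something like this, a parent handler can hand each child its own wrapped frame and afterwards inspect getOuterFrame() to see only what that child (or a wrapper around it) chose to report back up.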