lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From markharw00d <markharw...@yahoo.co.uk>
Subject Re: XML based Query Parser
Date Mon, 27 Feb 2006 08:24:12 GMT
Hi Chris,
Thanks for taking the time to look at this in detail.


>1) The factory classes should have "removeBuilder" methods so people
>   subclassing parsers can flat out remove support for a particular
>   tag, not just replace it.
>  
>
Can do.

>2) This DOM version definitely seems easier to follow/use then the SAX
>   version (allthough thta could just be because i'm more familiar
>   with DOM then SAX) .. but it still seems like an intermediate
>   representation that wasn't org.w3c.dom.Element would be handy -- if
>   for no other reason then so the methods in DOMUtils could be put
>   directly in the "Element" class.
>  
>
But doesn't sticking with w3c.dom.Element allow the possibility of 
standards based tools (eg XPath implementations) to be used by builders 
if they so wish?

>3) I'm still confused about how state information could/would be
>   passed up or down the tree by builders.  you mentioned something
>   before about using ThreadLocal (which i'm not very familiar with)
>   but i don't see any examples of that here.
>  
>
I ripped the ThreadLocal thing out and opted for 
DOMUtils.getAttributeWithInheritance instead. My one scenario I came 
across where I wanted some context passed down was "fieldName" and this 
is handled simply by leaf nodes walking up the w3c.dom.Node tree until 
you find an Element with this attribute set.

>4) The various getQuery (and getFilter and getSpanQuery) methods
>   should use e.getTagName when throwing ParserExceptions about not
>   finding what they're looking for,  That way, if i write a
>   parser that has...
>      queryFactory.addBuilder("tf",new TermQueryObjectBuilder());
>   ... the ParserExceptions thrown by TermQueryObjectBuilder.getQuery
>   when i forget to include a value="foo" can refrence the tag name
>   i'm using (tf), and not the hardcoded constant "TermQuery"
>  
>
Agreed. In addition, all nested tag names (eg BooleanQuery's  "Clause") 
should be configured as properties that can be changed if so desired. 
The addBuilder call shown above should probably be changed to just take 
a Builder object and it should get the root tag name from the 
Builder.getTagName you propose.


>5) it seems like a lot of redundency could be removed between the
>   various builders by refactoring some more utility functions --
>   particularly in the error cases  (see attached patch)
>  
>

Thanks, this tidies things up nicely.

I am still of the opinion that a self-documenting parser configuration 
("what XML can I write: is it <tf> or <TermQuery>?") is important, 
otherwise users have to examine parser source code to determine it's 
capabilities. If we make the step of allowing the parser configuration 
to provide metadata about what is required/optional etc we can not only 
produce documentation but also drive a query-building GUI.

Cheers
Mark


		
___________________________________________________________ 
Win a BlackBerry device from O2 with Yahoo!. Enter now. http://www.yahoo.co.uk/blackberry

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message