lucene-dev mailing list archives

From Chris Hostetter <hossman_luc...@fucit.org>
Subject Re: "Advanced" query language
Date Fri, 23 Dec 2005 09:20:48 GMT
: > I think that the ideal API wouldn't require people
: > writing ObjectBuilders
: > to know anything about sax, or to ever need to
: > import anything from
: > org.xml.** or javax.xml.**
:
: Fair enough. I presume we want to maintain the
: position that Lucene should not have any dependencies
: other than JDK1.4?

As I understand it, that's the current consensus -- but that's not really
my concern.  Whether lucene starts shipping with an xml library, or this
parser gets put into contrib with a caveat that it only works if you have
some specified xml library, is a policy/packaging issue .... i was more
worried about the API for people who want to develop new types of queries --
and new converters for building those queries from XML.  I'm scared of
tying that API to a particular method of parsing XML that would make it
hard to change the underlying implementation down the road.  (ie: having
all the methods throw SAXException would suck if 2 years from now we want
to re-implement it using XPP)

In my twisted Utopian imagination - lucene core would ship with an
implementation of the XML->Query parser/converter API that was 100% pure
java1.4; but alternate implementations (that used XPP or whatever the
flavor of the week was) would live in contrib for people who were willing
to trade the extra dependency for some trivial performance gain -- but the
same converters (aka: ObjectBuilders) would work with either
implementation.
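
To make the "no org.xml.* in the builder API" idea concrete, here is a minimal sketch. Every name in it (XmlNode, SimpleXmlNode, XmlQueryParseException, ObjectBuilder, TermStringBuilder) is hypothetical and invented for illustration; it is not an existing Lucene API. The builder returns a plain String so the sketch compiles without Lucene, and it uses generics for brevity, so it is not java1.4 code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser-neutral exception: builders never see SAXException.
class XmlQueryParseException extends Exception {
    XmlQueryParseException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical parser-neutral view of one element; nothing from org.xml.* leaks out.
interface XmlNode {
    String name();
    String attribute(String key);   // null when the attribute is absent
    String text();
    List<XmlNode> children();
}

// In-memory node that any back end (SAX, DOM, XPP, ...) could populate.
final class SimpleXmlNode implements XmlNode {
    private final String name;
    private final String text;
    private final Map<String, String> attributes = new LinkedHashMap<>();
    private final List<XmlNode> children = new ArrayList<>();

    SimpleXmlNode(String name, String text) {
        this.name = name;
        this.text = text;
    }
    SimpleXmlNode attr(String key, String value) { attributes.put(key, value); return this; }
    SimpleXmlNode child(XmlNode c) { children.add(c); return this; }

    public String name() { return name; }
    public String attribute(String key) { return attributes.get(key); }
    public String text() { return text; }
    public List<XmlNode> children() { return Collections.unmodifiableList(children); }
}

// Builders are written against the neutral types only.
interface ObjectBuilder<T> {
    T build(XmlNode node) throws XmlQueryParseException;
}

// Illustrative builder; a real one would return a Lucene Query instead of a String.
class TermStringBuilder implements ObjectBuilder<String> {
    public String build(XmlNode node) throws XmlQueryParseException {
        String field = node.attribute("field");
        if (field == null) {
            throw new XmlQueryParseException(
                "missing 'field' attribute on <" + node.name() + ">", null);
        }
        return field + ":" + node.text();
    }
}
```

Under these assumptions, swapping the underlying XML library only means writing a new piece of code that produces XmlNode instances; the builders themselves stay untouched.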

: State "passed down" is something I saw as a potential
: addition to the "Parser" object shared by all
: ObjectBuilders eg a Map that was associated with
: stack level.

If you put the state in the parser, then I can't imagine any
implementation could ever be thread safe.  I also can't really picture
what the API would be like without just making it a free for all of
essentially global variables -- ie: how does a parent ensure that state
info returned from one child doesn't pollute the next child? (unless the
parent wants it to) ...
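
One way to picture the alternative is state that is passed explicitly on each call instead of living in the shared parser. The sketch below is an assumption for illustration only: the State class, its with() method, and ParentHandler are invented names, and the "occurs" key is just an example value.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical immutable state object handed from parent to child on each call.
final class State {
    private final Map<String, String> values;

    State() {
        this(new HashMap<String, String>());
    }

    private State(Map<String, String> values) {
        this.values = Collections.unmodifiableMap(values);
    }

    String get(String key) {
        return values.get(key);
    }

    // Returns a new State with one extra entry; the receiver is never modified.
    State with(String key, String value) {
        Map<String, String> copy = new HashMap<>(values);
        copy.put(key, value);
        return new State(copy);
    }
}

// The parent decides exactly what each child sees.
final class ParentHandler {
    String process(State inherited) {
        // The first child gets a derived State carrying an extra entry.
        String first = child(inherited.with("occurs", "must"));
        // The second child gets the untouched original, so nothing leaks across.
        String second = child(inherited);
        return first + "," + second;
    }

    private String child(State s) {
        String occurs = s.get("occurs");
        return occurs == null ? "default" : occurs;
    }
}
```

Because no State instance is ever mutated, nothing is shared between threads or between sibling children unless the parent deliberately passes it along.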

: Although the "occurs" info could be set in the child
: object as in your example that pushes some parsing
: responsibility down into child elements and I feel
: slightly uncomfortable about that as a technique. It

I wouldn't want to put any requirements on query builders just to support
being wrapped in a BooleanQuery like that -- i was just trying to
illustrate why i thought a mechanism that allowed for "decorating"
handlers with other handlers would be very useful.  BooleanQuery is the
prime example to illustrate my goal.  While the default instance might
look something like...

      public interface LuceneXmlParser {
         public static LuceneXmlParser DEFAULT = ...
         static {
            DEFAULT.register("TermQuery",new TermQueryXmlBuilder());
            DEFAULT.register("PhraseQuery",new PhraseQueryXmlBuilder());
            ...
            DEFAULT.register("BooleanQuery", new BooleanQueryXmlBuilder());
            DEFAULT.register("Occurs", new OccursXmlBuilder());
            ...

...(and the term/phrase builders wouldn't know anything about
BooleanQueries) people who want a shorter syntax could do something
like...

     private LuceneXmlParser myParser = ...
     myParser.register("TermQuery",
                       new BooleanClauseWrapperXmlBuilder
                       (new TermQueryXmlBuilder()));
     myParser.register("PhraseQuery",
                       new BooleanClauseWrapperXmlBuilder
                       (new PhraseQueryXmlBuilder()));
     ...
     myParser.register("BooleanQuery", new BooleanQueryXmlBuilder());

...still reusing the original builders for all of the various types of
queries, with their special decorator wrapped around them.
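
For illustration, a decorating builder along these lines might look like the sketch below. The types are stand-ins invented here so the code compiles without Lucene: ParsedElement, XmlBuilder and Clause are hypothetical, the "occurs" attribute name is an assumption, and queries are represented as plain Strings.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a parsed element: a tag name plus its attributes (hypothetical type).
final class ParsedElement {
    final String tag;
    final Map<String, String> attrs = new HashMap<>();

    ParsedElement(String tag) { this.tag = tag; }
    ParsedElement attr(String key, String value) { attrs.put(key, value); return this; }
}

// Hypothetical builder contract; Object lets a builder yield a query or a clause.
interface XmlBuilder {
    Object build(ParsedElement e);
}

// Plain builder: knows nothing about boolean clauses.
final class TermQueryXmlBuilder implements XmlBuilder {
    public Object build(ParsedElement e) {
        return e.attrs.get("field") + ":" + e.attrs.get("value");
    }
}

// Stand-in for a (query, occurs) pair such as Lucene's BooleanClause.
final class Clause {
    final Object query;
    final String occurs;

    Clause(Object query, String occurs) {
        this.query = query;
        this.occurs = occurs;
    }
}

// Decorator: handles the "occurs" attribute itself and delegates everything else.
final class BooleanClauseWrapperXmlBuilder implements XmlBuilder {
    private final XmlBuilder delegate;

    BooleanClauseWrapperXmlBuilder(XmlBuilder delegate) {
        this.delegate = delegate;
    }

    public Object build(ParsedElement e) {
        Object query = delegate.build(e);
        String occurs = e.attrs.get("occurs");
        // Without an "occurs" attribute the wrapped result passes through unchanged.
        return occurs == null ? query : new Clause(query, occurs);
    }
}
```

The point of the design is that TermQueryXmlBuilder stays unchanged: only people who want the shorter syntax register the wrapped version.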


: I'll spend some time studying your pseudo code in more
: detail later.

last night on the place, i realized Filters were a glaring omission.
Since the "parent" always needs to know what it expects back from its
children, i think it makes sense to have separate handler interfaces, and for
the parser to have separate methods for each (depending on what you are
expecting ... where by "you" i mean either you the person asking it to
parse a raw bit of xml, or you the person implementing a parent
handler who's going to have to know at compile time whether you expect
a child to be a filter or a query)

in other words, amend what i sent out before, something like this...


import java.io.InputStream;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.Query;

public interface LuceneXmlParser {
    public void registerQueryHandler(String tag, LuceneXmlQueryHandler h);
    public void registerFilterHandler(String tag, LuceneXmlFilterHandler h);
    // java can't overload on return type alone, so the query and filter
    // variants need distinct names
    public Query parseQuery(InputStream xml);
    public Filter parseFilter(InputStream xml);
    public Query processQueryNode(LuceneXmlNode n, State s);
    public Filter processFilterNode(LuceneXmlNode n, State s);
}
public interface LuceneXmlHandler {
    public void setParser(LuceneXmlParser p);
}
public interface LuceneXmlQueryHandler extends LuceneXmlHandler {
    public Query process(LuceneXmlNode n, State s);
}
public interface LuceneXmlFilterHandler extends LuceneXmlHandler {
    public Filter process(LuceneXmlNode n, State s);
}
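
As a rough sketch of how a parser could keep the two kinds of handlers apart, something like the following would work. Everything here is a simplified assumption: SketchQuery and SketchFilter stand in for Lucene's Query and Filter, a node is reduced to its tag name, the State argument is left out, and the processQueryNode / processFilterNode names are chosen only because Java cannot overload methods on return type alone.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-ins for Lucene's Query and Filter so the sketch compiles on its own.
final class SketchQuery {
    final String description;
    SketchQuery(String description) { this.description = description; }
}

final class SketchFilter {
    final String description;
    SketchFilter(String description) { this.description = description; }
}

// Separate handler types: the caller knows at compile time what comes back.
interface QueryHandler {
    SketchQuery process(String tag);
}

interface FilterHandler {
    SketchFilter process(String tag);
}

// Hypothetical registry with one map per kind of handler.
final class HandlerRegistry {
    private final Map<String, QueryHandler> queryHandlers = new HashMap<>();
    private final Map<String, FilterHandler> filterHandlers = new HashMap<>();

    void registerQueryHandler(String tag, QueryHandler h) {
        queryHandlers.put(tag, h);
    }

    void registerFilterHandler(String tag, FilterHandler h) {
        filterHandlers.put(tag, h);
    }

    // Used when the caller (or a parent handler) expects a query.
    SketchQuery processQueryNode(String tag) {
        QueryHandler h = queryHandlers.get(tag);
        if (h == null) {
            throw new IllegalArgumentException("no query handler for <" + tag + ">");
        }
        return h.process(tag);
    }

    // Used when the caller (or a parent handler) expects a filter.
    SketchFilter processFilterNode(String tag) {
        FilterHandler h = filterHandlers.get(tag);
        if (h == null) {
            throw new IllegalArgumentException("no filter handler for <" + tag + ">");
        }
        return h.process(tag);
    }
}
```

With two maps, a tag that is only registered as a query fails fast when a parent asks for it as a filter, instead of surfacing later as a class cast problem.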

-Hoss

