lucene-dev mailing list archives

From Pier Fumagalli <p...@betaversion.org>
Subject Re: XML search language...
Date Mon, 09 Dec 2002 22:59:24 GMT
On 8/12/02 17:13 "Mark R. Diggory" <mdiggory@latte.harvard.edu> wrote:

> You're not providing enough info about what you're trying to do, thus you
> get generalized responses about the basic messaging technologies available.

Yeah, sorry Mark, but I'm overwhelmed by work ATM, and this project will
start early next January... So, I can only dedicate a few minutes a day...

A couple of days ago, I flagged a message in which you said:

> <?xml version="1.0" encoding="UTF-8"?>
> <query>
>  <boolean type="and">
>    <term field="character">Bird</term>
>    <group>
>      <term field="category">Cartoon</term>
>      <boolean type="not">
>        <term field="name">Roadrunner</term>
>      </boolean>
>    </group>
>  </boolean>
> </query>

That _is_ a beauty.. Plain, simple, and doing exactly what I need! :-)

No weird XML-RPC/SOAP/QUERY stuff, just one tiny little thing that does the
job I require... :-)

The only thing I don't "like" is how you group the terms: for example, I
don't quite get the distinction between "boolean / and" and group...

In theory, a binary operation can always be reduced to its minimal
configuration of two terms, depending on what precedence we give to (for
instance) the "and", "or" and "not" operations... So, I don't see why group
is actually there! :-)

And also, one other thing: since we have the flexibility of XML, why not
use specific tags, such as <and/>, <or/> and <not/>?

That is because, if you process SAX events, you can easily trigger on those
element names, which are unique to each operation, while if you use
attributes, the whole thing gets a little messier in terms of
parsing/checking, and slower, because you have to inspect the attributes to
get the "type" of your boolean operation...
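
To make the comparison concrete, here is a minimal sketch of a SAX handler that recognises both styles. The class and method names (OperatorDispatch, OperatorCollector, operatorsIn) are made up for illustration and are not part of Lucene or of either proposal. With element names the operator is known from the tag itself; with the attribute style the handler needs one extra lookup through Attributes.getValue and has to cope with a missing "type".

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class OperatorDispatch {

    /** Hypothetical handler: records the boolean operators in document order. */
    static class OperatorCollector extends DefaultHandler {
        final List<String> operators = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) {
            switch (qName) {
                case "and":
                case "or":
                case "not":
                    // Element-name style: the tag name alone identifies the operator.
                    operators.add(qName);
                    break;
                case "boolean":
                    // Attribute style: one extra lookup, which returns null
                    // when the "type" attribute is missing.
                    String type = attrs.getValue("type");
                    if (type != null) {
                        operators.add(type);
                    }
                    break;
                default:
                    // <query>, <term>, <group> carry no operator.
                    break;
            }
        }
    }

    /** Parses the XML string and returns the operators that were found. */
    public static List<String> operatorsIn(String xml) throws Exception {
        OperatorCollector handler = new OperatorCollector();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.operators;
    }

    public static void main(String[] args) throws Exception {
        String byName = "<query><and><term>a</term>"
                + "<or><term>b</term><term>c</term></or></and></query>";
        String byAttribute = "<query><boolean type=\"and\"><term>a</term>"
                + "<boolean type=\"not\"><term>b</term></boolean></boolean></query>";
        System.out.println(operatorsIn(byName));
        System.out.println(operatorsIn(byAttribute));
    }
}
```

Whether the attribute lookup costs anything measurable would have to be benchmarked; the more visible difference in this sketch is the extra null check in the attribute branch.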

I'm thinking about something like:

<?xml version="1.0"?>
<query index="Articles">
  <and>
    <term field="subject">Microsoft</term>
    <or>
      <term>Lawsuit</term>
      <term>Court</term>
    </or>
  </and>
</query>
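
As a rough sketch of how a document like this could be consumed, the handler below walks the SAX events and renders the nested operators as a parenthesised query string. It is illustrative only: XmlQueryToString and toQueryString are hypothetical names, the "field:value AND (...)" output format is an assumption, the index attribute on <query> is ignored, and <not> is assumed to wrap a single child. No Lucene classes are involved.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class XmlQueryToString {

    /** Hypothetical handler: one list of rendered clauses per open element. */
    static class Handler extends DefaultHandler {
        private final Deque<List<String>> stack = new ArrayDeque<>();
        private final StringBuilder text = new StringBuilder();
        private String field;
        String result;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) {
            if (qName.equals("term")) {
                field = attrs.getValue("field"); // null when no field is given
                text.setLength(0);               // start collecting the term text
            } else {
                stack.push(new ArrayList<>());   // <query>, <and>, <or>, <not>
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (qName.equals("term")) {
                String value = text.toString().trim();
                stack.peek().add(field == null ? value : field + ":" + value);
                return;
            }
            List<String> clauses = stack.pop();
            String rendered;
            switch (qName) {
                case "and": rendered = "(" + String.join(" AND ", clauses) + ")"; break;
                case "or":  rendered = "(" + String.join(" OR ", clauses) + ")"; break;
                case "not": rendered = "NOT " + String.join(" ", clauses); break;
                default:    rendered = String.join(" ", clauses); break; // <query>
            }
            if (stack.isEmpty()) {
                result = rendered;               // closed the root element
            } else {
                stack.peek().add(rendered);      // becomes a clause of the parent
            }
        }
    }

    /** Parses the XML query and returns its string rendering. */
    public static String toQueryString(String xml) throws Exception {
        Handler handler = new Handler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
                + "<query index=\"Articles\"><and>"
                + "<term field=\"subject\">Microsoft</term>"
                + "<or><term>Lawsuit</term><term>Court</term></or>"
                + "</and></query>";
        // prints (subject:Microsoft AND (Lawsuit OR Court))
        System.out.println(toQueryString(xml));
    }
}
```

The nesting of the elements supplies all the parentheses here, so the sketch needs no separate group element.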

Does it make sense????

    Pier

