lucene-dev mailing list archives

From Robert Engels <reng...@ix.netcom.com>
Subject Re: "Advanced" query language
Date Fri, 02 Dec 2005 23:03:18 GMT
I don't see the value in this. Whatever is generating the XML could just as easily create/instantiate
the query objects.

I would much rather see the query parser migrated to an internal parser (that would be easier
to maintain), and develop a syntax that allowed easier use of the most common/powerful features.

-----Original Message-----
From: mark harwood <markharw00d@yahoo.co.uk>
Sent: Dec 2, 2005 10:03 AM
To: java-dev@lucene.apache.org
Subject: "Advanced" query language

There seems to be a growing gap between Lucene
functionality and the query language offered by
QueryParser (e.g. no support for regex queries, span
queries, "more like this", filter queries,
minNumShouldMatch, etc.).

Closing this gap is hard when:
a) The availability of JavaCC+Lucene skills is a
bottleneck
b) The syntax of the query language makes it difficult
to add new features, e.g. we are rapidly running out of
"special characters"

I don't think extending the existing query
parser/language is necessarily useful and I see it
being used purely to support the classic "simple
search engine" syntax. 

Unfortunately the fall-back position for applications
which require more complex queries is to "just write
some Java code to instantiate the Query objects
programmatically." This is OK but I think there is
value in having an advanced search syntax capable of
supporting the latest Lucene features and expressed in
XML. It's worth considering why it's useful to have a
String-representable form for queries:
1) Queries can be stored, e.g. in audit logs or as "saved
queries" used for tasks like auto-categorization
2) Clients built in languages other than Java can
issue queries to a Lucene server
3) I can decouple a request from the code that
implements the query when distributing software, e.g. my
applet may not want Lucene dragged down to the client

Currently we cannot easily do the above for any
"complex" queries because they are not easily
persisted (yes, we could serialize Query objects but
that seems messy and does not solve points 2 and 3).

We can potentially use XML in the same way Ant does,
i.e. as a declarative way of invoking an extensible list
of Java-implemented features. A query interpreter is
used to instantiate the configured Java Query objects
and populate them with settings from the XML in a
generic fashion (using reflection), e.g.:
....
   <MoreLikeThis minNumberShouldMatch="3" maxQueryTerms="30">
      <text>
         Lorem ipsum dolor sit amet, consectetuer adipiscing
         elit. Morbi eget ante blandit quam faucibus posuere. Vivamus
         porta, elit fringilla venenatis consequat, neque lectus
         gravida dolor, sed cursus nunc elit non lorem. Nullam congue
         orci id eros. Nunc aliquet posuere enim.
      </text>
   </MoreLikeThis>
</BooleanClause>
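
As a rough illustration of the reflection idea, an interpreter along these
lines might look like the sketch below. It is only a sketch under stated
assumptions: MoreLikeThisStub is a made-up stand-in class rather than a real
Lucene query, and REGISTRY, build and set are hypothetical names. Attributes
and child elements are mapped to same-named setters, much as Ant maps XML
attributes to task properties.

```java
import java.io.StringReader;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class XmlQuerySketch {

    // Hypothetical stand-in for a query class; NOT part of Lucene.
    static class MoreLikeThisStub {
        int minNumberShouldMatch;
        int maxQueryTerms;
        String text;
        public void setMinNumberShouldMatch(int n) { minNumberShouldMatch = n; }
        public void setMaxQueryTerms(int n) { maxQueryTerms = n; }
        public void setText(String t) { text = t; }
    }

    // Extensible element-name -> class registry (assumed design, similar to Ant task definitions).
    static final Map<String, Class<?>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("MoreLikeThis", MoreLikeThisStub.class);
    }

    // Parses one XML element and builds the registered object for it.
    static Object build(String xml) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        Class<?> type = REGISTRY.get(root.getTagName());
        if (type == null) {
            throw new IllegalArgumentException("Unknown element: " + root.getTagName());
        }
        Object query = type.getDeclaredConstructor().newInstance();

        // Each XML attribute is applied through the setter of the same name.
        NamedNodeMap attrs = root.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Node attr = attrs.item(i);
            set(query, attr.getNodeName(), attr.getNodeValue());
        }
        // Each child element is treated as a string-valued property.
        for (Node c = root.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element) {
                set(query, c.getNodeName(), c.getTextContent().trim());
            }
        }
        return query;
    }

    // Finds a one-argument setter by name and converts the string value (int or String only).
    static void set(Object target, String name, String value) throws Exception {
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        for (Method m : target.getClass().getMethods()) {
            if (!m.getName().equals(setter) || m.getParameterCount() != 1) {
                continue;
            }
            Class<?> p = m.getParameterTypes()[0];
            Object arg = (p == int.class) ? (Object) Integer.valueOf(value) : value;
            m.invoke(target, arg);
            return;
        }
        throw new IllegalArgumentException("No setter for property: " + name);
    }

    public static void main(String[] args) throws Exception {
        MoreLikeThisStub q = (MoreLikeThisStub) build(
                "<MoreLikeThis minNumberShouldMatch=\"3\" maxQueryTerms=\"30\">"
                + "<text>Lorem ipsum dolor sit amet</text></MoreLikeThis>");
        System.out.println(q.minNumberShouldMatch + " " + q.maxQueryTerms + " " + q.text);
    }
}
```

Supporting a new query type in this sketch would only require one more
REGISTRY entry; nested clauses such as the BooleanClause wrapper above are
not handled here and would need recursive handling.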

Do people feel this would be a worthwhile endeavour?
I'm not sure if enough people feel pain around the
points 1-3 outlined above to make it worth pursuing.


Cheers
Mark



		

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org






