lucene-dev mailing list archives

Subject Re: "Advanced" query language
Date Sun, 18 Dec 2005 01:43:06 GMT
Paul and Wolfgang,

Thank you very much for your input. I think there are two distinct problems that have emerged
from this thread:
1) The ability to create efficient structures to index and query XML documents (elements, attributes
and corresponding values) with a full-text query language and operators. After all, XML is
text. As Paul pointed out, people have already tried this with Lucene.
2) The need for a standard query language like XQuery, aimed at system interoperability in
the now XML-ized world, that has the same effect SQL had in the relational world.

While I can see how in the SQL case extension functions can be used to implement full-text
capabilities, in the XML case full-text is required to query and retrieve XML (sub-document)
elements and attributes based on their free-text (natural language) values and also to query
the strings that represent the structure itself. For example, in simple SQL queries the names
of the tables and columns need to be known to project corresponding values and are not part
of the search conditions (in WHERE clauses only values corresponding to table/columns are

In XQuery both the structure and the content are searchable, thus requiring full-text operators.
That is why XQuery Full-Text requires unifying and standardizing both the XQuery and
Full-Text "languages". Needless to say, the implementation will differ from system
to system.
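
To make the structure-versus-content point concrete, here is a minimal sketch in Python. It is not XQuery Full-Text and not anything Lucene provides; the sample document, the search() helper and the plain substring matching are all assumptions made for illustration. It only shows that in XML the same kind of text match can be applied to element names as well as to their values.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """
<library>
  <book isbn="123">
    <title>Indexing text with Lucene</title>
    <author>A. Writer</author>
  </book>
  <article>
    <title>Full-text search in XQuery</title>
    <coauthor>B. Editor</coauthor>
  </article>
</library>
"""

def search(root, name_term=None, text_term=None):
    """Return elements whose tag contains name_term and whose text
    contains text_term (case-insensitive substring match)."""
    hits = []
    for el in root.iter():
        if name_term is not None and name_term.lower() not in el.tag.lower():
            continue
        if text_term is not None and text_term.lower() not in (el.text or "").lower():
            continue
        hits.append(el)
    return hits

root = ET.fromstring(DOC)

# Search the structure: element names do not have to be known exactly.
structure_hits = [el.tag for el in search(root, name_term="author")]

# Search the content: values only, as a SQL WHERE clause would.
content_hits = [el.tag for el in search(root, text_term="lucene")]

# Search both at once: a structural condition plus a content condition.
both = [el.text for el in search(root, name_term="title", text_term="xquery")]
```

A real full-text engine would tokenize, stem and score rather than use substring matching; the sketch only illustrates that the names and the values are both searchable text.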

I do agree though that the abstraction of full-text capabilities through functional extensions
is a great first step. Check out Oracle's XML Query Service (
and , a Java-based XQuery
engine that has abstracted "data sources" such as Web Services, RDBMS, etc. as functions
that return XML, can receive parameters and can supply full-text capabilities. If Mark's
implementation of Lucene query and output in XML comes to fruition, a Lucene data source will
become yet another stream of XML that can be queried, processed and rendered by the mid-tier
XQuery engine.
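
As a rough sketch of that idea, again in Python and with every name hypothetical (fulltext_source, catalog_source and mid_tier are invented here and do not correspond to Oracle's product or to any Lucene API), each data source is a function that takes parameters and returns XML, and a mid-tier step joins the resulting streams:

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory "index"; it stands in for a real full-text engine.
_DOCS = {
    1: "XQuery full-text operators",
    2: "Relational joins in SQL",
    3: "Indexing XML with full-text search",
}

def fulltext_source(term):
    """Hypothetical data-source function: takes a term, returns hits as XML."""
    results = ET.Element("results", {"source": "fulltext", "query": term})
    for doc_id, text in sorted(_DOCS.items()):
        if term.lower() in text.lower():
            hit = ET.SubElement(results, "hit", {"id": str(doc_id)})
            hit.text = text
    return results

def catalog_source():
    """Hypothetical second source (say, a database table) exposed the same way."""
    results = ET.Element("results", {"source": "catalog"})
    for doc_id, owner in [(1, "alice"), (3, "bob")]:
        ET.SubElement(results, "row", {"id": str(doc_id), "owner": owner})
    return results

def mid_tier(term):
    """Join the two XML streams on their id attribute."""
    owners = {row.get("id"): row.get("owner")
              for row in catalog_source().iter("row")}
    return [(hit.get("id"), owners.get(hit.get("id")), hit.text)
            for hit in fulltext_source(term).iter("hit")]

joined = mid_tier("full-text")
```

The point of the design is that the mid-tier only ever sees XML, so it does not need to know whether a stream came from a full-text index, a database or a web service.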

-- Joaquin
