lucene-java-user mailing list archives

From lukai <>
Subject Re: Semi-structured queries
Date Fri, 07 Dec 2012 21:54:20 GMT
wrap your own parser.

e.g. org/apache/lucene/queryparser/classic/QueryParser.jj.
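
To make "your own parser" a bit more concrete, here is a minimal sketch of the first step: splitting a query such as the one quoted below into plain terms and pseudo-XML elements. It is plain Java with regular expressions, not a JavaCC grammar and not Lucene code; the class name SemiStructuredQuerySketch, the Clause type, and the parse method are all hypothetical names made up for illustration. It assumes elements are a single opening tag with attr="value" pairs and does not handle NEARBY(...) or nesting.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: splits a semi-structured query into terms and pseudo-XML elements. */
class SemiStructuredQuerySketch {

    /** One parsed piece of the query: a plain term, or a tag with its attributes. */
    static final class Clause {
        final String text;                    // the term itself, or the tag name
        final Map<String, String> attributes; // null for plain terms

        Clause(String text, Map<String, String> attributes) {
            this.text = text;
            this.attributes = attributes;
        }

        boolean isElement() {
            return attributes != null;
        }
    }

    // Either a <tag ...> element, or a run of characters with no whitespace and no '<'.
    private static final Pattern TOKEN =
        Pattern.compile("<\\s*(\\w+)([^>]*)>|([^\\s<]+)");

    // attr="value"; straight and typographic double quotes are both accepted,
    // since mail clients often convert one into the other.
    private static final Pattern ATTRIBUTE = Pattern.compile(
        "(\\w+)\\s*=\\s*[\"\u201C\u201D]([^\"\u201C\u201D]*)[\"\u201C\u201D]");

    /** Returns the clauses of the query in the order they appear. */
    static List<Clause> parse(String query) {
        List<Clause> clauses = new ArrayList<>();
        Matcher token = TOKEN.matcher(query);
        while (token.find()) {
            if (token.group(1) != null) {
                // Element: collect its attribute pairs, keeping their order.
                Map<String, String> attrs = new LinkedHashMap<>();
                Matcher attr = ATTRIBUTE.matcher(token.group(2));
                while (attr.find()) {
                    attrs.put(attr.group(1), attr.group(2));
                }
                clauses.add(new Clause(token.group(1), attrs));
            } else {
                // Plain term.
                clauses.add(new Clause(token.group(3), null));
            }
        }
        return clauses;
    }
}
```

From there, one plausible next step (an assumption, not something the classic parser does for you) is to turn each plain term into a TermQuery, turn each element into whatever query type fits your index layout, and combine them in a BooleanQuery. A real grammar in the style of QueryParser.jj would replace the regular expressions once the syntax grows.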

On Fri, Dec 7, 2012 at 1:47 PM, Wu, Stephen T., Ph.D. wrote:

> I’ve been trying to do semi-structured queries & query parsing.  In other
> words, you could have XML snippets mixed in with plain terms, e.g. a query
> like:
>       christmas tree <store  loc=”abc” close_hour=”2200”>
> where you’re looking for a document with the terms “christmas” “tree” but
> also some structured data about where (practically) you could buy the tree.
>   Additionally, I’d like to be able to write functions relating multiple
> items, sort of like predicate logic or database-like queries:
>       christmas tree NEARBY( <store  close_hour=”2200”>, <restaurant
> close_hour=”2400”> )
> which would only find you places to buy a christmas tree that had stores
> and restaurants in close proximity to each other.  Finally, we would
> eventually be interested in doing something similar to
> org.apache.lucene.queries.CustomScoreQuery, where you can put in several
> different criteria and weight them separately per document.
> I’ve been poking around at a lot of places and would appreciate some help
> about where I should extend, an existing walkthrough or example, etc.
>  Here’s what I’ve been considering:
>   *   org/apache/lucene/queryparser/flexible/standard/ —
> modifying this to add another group-like QueryNode, modifying the processor
> pipeline to include this, modifying the definition of a TERM so it can deal
> with attribute=”value” pairs in pseudo-xml.  I read through the QueryParser
> documentation but quickly got lost in the implementation.
>   *   org/apache/lucene/queryparser/xml/ —
> this seems like it has to do a lot of what I want, but I can’t tell.  I
> hadn’t originally thought of the query coming in as an xml stream.  I think
> I would still need to define some new Query types... Perhaps a lot?  One
> for each type of thing (“store”, in the above) I’d search for?
> Thanks!
> stephen
