lucene-java-user mailing list archives

From "Erick Erickson" <>
Subject Re: Lucene and queries
Date Sun, 21 Jan 2007 15:14:20 GMT
My question is "what are you trying to accomplish"? The reason I ask is that
all three queries presuppose that the search you're performing is on very
precisely defined fields. (1) supposes a field where A is the third word
from the end. (3) supposes A is the third word from the beginning. (2) must
be a field that has A as the fifth word from the end and B as the second
word from the end. This seems like an odd requirement and doesn't map onto
problem spaces I'm familiar with. I wonder if there are parts of your
problem you're leaving out, for instance "in a sentence" or some such....
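
To make that reading concrete, here is a rough plain-Java sketch (no Lucene
involved) of what the three queries would be testing if the patterns really
are anchored to the ends of the field. The class and method names are made up
for illustration, and the field is assumed to be already tokenized into words.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: models the "anchored" reading of the
// three queries on an already-tokenized field. Not Lucene code.
class AnchoredPositions {
    // True if `term` is the k-th token counting back from the end (k = 1 is the last token).
    static boolean isKthFromEnd(List<String> tokens, String term, int k) {
        int i = tokens.size() - k;
        return k >= 1 && i >= 0 && tokens.get(i).equals(term);
    }

    // True if `term` is the k-th token counting from the start (k = 1 is the first token).
    static boolean isKthFromStart(List<String> tokens, String term, int k) {
        return k >= 1 && k <= tokens.size() && tokens.get(k - 1).equals(term);
    }

    public static void main(String[] args) {
        List<String> field = Arrays.asList("x", "y", "A", "p", "q", "B", "z");
        // (1) A * *     : A is the third token from the end
        System.out.println(isKthFromEnd(field, "A", 3));
        // (2) A * * B * : A is fifth from the end and B is second from the end
        System.out.println(isKthFromEnd(field, "A", 5) && isKthFromEnd(field, "B", 2));
        // (3) * * A     : A is the third token from the start
        System.out.println(isKthFromStart(field, "A", 3));
    }
}
```

If the patterns are meant to match anywhere inside a longer text (say, within
a sentence), this anchored check is the wrong model, which is why the extra
detail matters.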

I'm pretty sure the information you need to create a custom HitCollector is
in the index, and there are a bunch of indexing tricks you can play to make
this easier, but I suspect the contributors to this list will have much more
helpful answers for you if you add a bit of detail....

There's no syntax I know of that'll give you this kind of query out of the
box. The closest thing would be span queries, which will give you things
like A**B, meaning "give me all documents where A is NOT MORE THAN 2 words
away from B". This is not what you're asking for, though, since it would
also return A*B and AB...
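
The difference between "not more than 2 words away" and "exactly 2 words
away" can be sketched in plain Java. This only models the two matching rules
on a list of tokens; it is not how Lucene evaluates span queries, and the
class and method names are invented for the example. It also assumes A must
come before B.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: contrasts an "at most N words between"
// rule with an "exactly N words between" rule. Not Lucene code.
class GapCheck {
    // True if some occurrence of `a` is followed by `b` with exactly `gap` tokens in between.
    static boolean exactGap(List<String> tokens, String a, String b, int gap) {
        for (int i = 0; i + gap + 1 < tokens.size(); i++) {
            if (tokens.get(i).equals(a) && tokens.get(i + gap + 1).equals(b)) {
                return true;
            }
        }
        return false;
    }

    // True if some occurrence of `a` is followed by `b` with at most `maxGap` tokens in between.
    static boolean withinGap(List<String> tokens, String a, String b, int maxGap) {
        for (int gap = 0; gap <= maxGap; gap++) {
            if (exactGap(tokens, a, b, gap)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> adjacent = Arrays.asList("A", "B");            // AB
        List<String> twoApart = Arrays.asList("A", "x", "y", "B");  // A**B

        // The "not more than 2" rule accepts both documents.
        System.out.println(withinGap(adjacent, "A", "B", 2));
        System.out.println(withinGap(twoApart, "A", "B", 2));

        // The "exactly 2" rule accepts only A**B.
        System.out.println(exactGap(adjacent, "A", "B", 2));
        System.out.println(exactGap(twoApart, "A", "B", 2));
    }
}
```

So to get the exact behaviour, one option would be to take the looser set of
hits and post-filter them with a check like exactGap.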


On 1/21/07, david chris <> wrote:
> Hi,
> I am wondering if Lucene can handle the following queries:
> (1) A * *
> give me all documents with word A followed by exactly two words
> (2) A * * B *
> give me all documents with words A and B exactly separated by 2 words and
> word B followed by one word
> (3) * * A
> give me all documents with word A prefixed by exactly two words
> Thanks.
> David.
