lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mark Miller" <markrmil...@gmail.com>
Subject Re: Proximity searching with subexpressions
Date Wed, 09 May 2007 16:04:04 GMT
The ~ syntax can only be applied to a single phrase, i.e. "the greate
one"~3. This sets the slop allowed for the phrase to be 3. The defintion of
slop can be found by searching the archive, but it will not easily allow for
what you are looking for. The Lucene query language has not yet recived Span
systax support. You might want to check out Surround in contrib which is an
alternate syntax with support for Spans. I also have a parser that adds
proximity support...I am currently sitting on version 1.0 waiting for some
free time to release. My syntax consists of binary operators including a
proximity operator. Precedence works as expected (being a binary op sytax)
and a search for (alien ufo crash site ) ~10 (Roswell NM New Mexico ) would
make sure that each term on the left is within 10 of each term on the right
(if you default space operator was AND).

-Mark


On 5/9/07, Walt Stoneburner <walt.stoneburner@gmail.com> wrote:
>
> According to the documentation for Lucene's Query Parser Syntax, the
> tilde operator provides a proximity search.
>
> For instance, "Harry Hallows"~6 should match the text 'Harry Potter
> and the Deathly Hallows'.
>
> And while this is fine for two token phrases, I was wondering if it
> worked well for subexpressions:
>
> ( ( "alien" "ufo" "crash site" ) ( "Roswell" "NM" "New Mexico" ) ) ~10
>
> Will this match "alien ship crashed in Roswell" in addition to "ufo
> spotted hovering over New Mexico"?
>
> I've been searching through the Lucene groups
> (http://www.gossamer-threads.com/lists/lucene/java-user/) for span and
> proximity, but have come up with nothing about such syntax or
> capability.
>
> Additionally, I tried entering expressions containing a tilde into
> Luke and then doing an explain structure.
>
> Surprisingly, I didn't see a difference between a simple case like "A
> B" and "A B"~10 in the explanation.
>
> Is this a problem with Luke, or did I possibly miss something trivial?
>
> Anyhow, the more important question is: Is it possible to do proximity
> searches on lists of terms?
>
> Thanks,
> -Walt Stoneburner, <wls@wwco.com>
> http://www.wwco.com/~wls/blog/
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message