lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Walt <>
Subject teragram to Lucene
Date Fri, 15 Jul 2011 15:00:47 GMT
    I am responsible for moving a Teragram application to Lucene. I have
identified the following issues so I would like verification that what
the existing rules have do not exist in Lucene or there is a work-around.

1) Teragram uses a Polish Notation for its rules, i.e.
Note: DIST_2 is a proximity of within 2 words

DIST_2     (OR (phrase1, phrase 2, .... phrase 20)
                (OR(phrasea, phraseb, phrase c)

This says "if any phrase in this group is within 2 words of any phrase
in the second group".

My understanding is proximity in Lucene is only against individual
phrases, not between two groups. Is there any way to do the functional
equivalent short of

"phrase1 phrasea"~2 "phrase1 phraseb"~2 "phrase2 phrasea"~2 "phrase2
phraseb"~2 etc

The other question I have is "is there a notion of a sentence such that
one could say do these word occur in the same sentence?


Walt Corey

To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message