lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Rewrite one phrase to another in search query
Date Wed, 27 Jun 2007 14:57:31 GMT
The synonym analyzer shown in Lucene in Action is a good place
to start. You need to change *all* occurrences of one form into
the other, both at index time and at search time, to get consistent results.
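As a rough illustration of "change all occurrences of one form into another", here is a minimal sketch in plain Java. It does not use the Lucene API; PhraseNormalizer and its methods are hypothetical names, and the whitespace tokenizer is an assumption standing in for a real analyzer. The point is only that the same normalization runs over document text and over query text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: collapses a multi-word phrase
// into one canonical token so indexed text and query text agree.
class PhraseNormalizer {
    private final List<String> phrase;  // e.g. [transmission, control, protocol]
    private final String canonical;     // e.g. tcp

    PhraseNormalizer(String phrase, String canonical) {
        this.phrase = tokenize(phrase);
        if (this.phrase.isEmpty()) {
            throw new IllegalArgumentException("phrase must contain at least one token");
        }
        this.canonical = canonical;
    }

    // Assumption: lowercase and split on whitespace. A real analyzer does more.
    static List<String> tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // Apply this to document text at index time AND to query text at search time.
    List<String> normalize(String text) {
        List<String> in = tokenize(text);
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < in.size()) {
            if (matchesAt(in, i)) {
                out.add(canonical);   // replace the whole phrase with one token
                i += phrase.size();
            } else {
                out.add(in.get(i));
                i++;
            }
        }
        return out;
    }

    private boolean matchesAt(List<String> tokens, int start) {
        if (start + phrase.size() > tokens.size()) {
            return false;
        }
        return tokens.subList(start, start + phrase.size()).equals(phrase);
    }
}
```

With new PhraseNormalizer("transmission control protocol", "tcp"), both normalize("tcp") and normalize("Transmission Control Protocol") yield the single token tcp, so either form of the query finds either form of the document.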

There are some "interesting" implications of this, but they
only really need to be considered if you need either phrase or
span queries. For instance, let's say you have the following doc
fragments:
doc1: "this is a tcp interaction that I want to deal with"
doc2: "this is a transmission control protocol interaction that I want to deal with"

Is "this" within 4 of "interaction" in both documents? Do you care?
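To make the position question concrete, here is a small plain-Java sketch that counts token positions in the two fragments as written, with no synonym handling. PositionGap is a hypothetical name and whitespace tokenization is an assumption; how a real phrase or span query counts slop differs in detail, but the two documents differ by two positions either way.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene.
class PositionGap {
    // Difference in token positions between the first occurrences of two
    // terms, or -1 if either term is missing from the text.
    static int gap(String text, String first, String second) {
        List<String> tokens =
                Arrays.asList(text.toLowerCase(Locale.ROOT).trim().split("\\s+"));
        int a = tokens.indexOf(first);
        int b = tokens.indexOf(second);
        return (a < 0 || b < 0) ? -1 : Math.abs(b - a);
    }

    public static void main(String[] args) {
        String doc1 = "this is a tcp interaction that I want to deal with";
        String doc2 = "this is a transmission control protocol interaction "
                + "that I want to deal with";
        System.out.println(gap(doc1, "this", "interaction")); // prints 4
        System.out.println(gap(doc2, "this", "interaction")); // prints 6
    }
}
```

If the phrase is collapsed to a single token at index time, doc2 behaves like doc1; if it is left expanded, the same proximity query can match one document and miss the other.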

Also, is the phrase "transmission control protocol" a match for the
first document? Would the user be confused by matching a document
with "tcp" in it for that phrase?

For that matter, does searching on "transmission" match doc1?
Mostly, these are issues that may or may not be relevant depending
on the intent of the application...
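One way to see how the answers depend on the approach: the sketch below assumes the "collapse the phrase to tcp" strategy is applied to both documents and queries. It is plain Java, not Lucene code; SingleTermCheck is a hypothetical name, and the string replace is a crude stand-in for an analyzer.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
class SingleTermCheck {
    // Crude stand-in for an analyzer: lowercase, collapse the phrase to
    // "tcp", then split on whitespace.
    static Set<String> terms(String text) {
        String normalized = text.toLowerCase(Locale.ROOT)
                .replace("transmission control protocol", "tcp");
        return new HashSet<>(Arrays.asList(normalized.trim().split("\\s+")));
    }

    // The query goes through the same normalization as the document;
    // a document matches if it contains every query term.
    static boolean matches(String doc, String query) {
        return terms(doc).containsAll(terms(query));
    }

    public static void main(String[] args) {
        String doc1 = "this is a tcp interaction that I want to deal with";
        String doc2 = "this is a transmission control protocol interaction "
                + "that I want to deal with";
        System.out.println(matches(doc1, "Transmission Control Protocol")); // true
        System.out.println(matches(doc2, "tcp"));                           // true
        System.out.println(matches(doc1, "transmission"));                  // false
        System.out.println(matches(doc2, "transmission"));                  // false
    }
}
```

Under this assumption the full phrase and "tcp" find both documents, but "transmission" on its own finds neither, because the word no longer exists as a separate term in the index. Whether that is acceptable is exactly the kind of application-level decision described above.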

Highlighting also becomes interesting.

Best
Erick


On 6/27/07, Aliaksandr Radzivanovich <aradzivanovich@gmail.com> wrote:
>
> What if I need to search for synonyms, but synonyms can be expanded to
> phrases of several words?
> For example, if the user enters the query "tcp", my application should also
> find documents containing the phrase "Transmission Control Protocol". And
> conversely, if the user enters "Transmission Control Protocol", my
> application should also find documents with the word "tcp".
>
> It seems that Lucene does not support this scenario out of the box.
> Where should I look for a solution? Which Lucene
> extensions/classes/interfaces should I investigate?
>
> Thanks.
