lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: controlled indexing with Lucene
Date Sun, 30 Nov 2008 02:12:30 GMT
I'm not entirely clear what you're trying to accomplish, but there's
a bunch of options.

You could just index the phrases as normal text, possibly with
positionincrementgaps (assuming more than one phrase is
possible in a field in a document). Then you could use SpanQuerys
to accomplish the "simple" form (choosing suitable increment
gaps and suitable distances on your span queries would insure that
hits wouldn't span phrases). You could then use regular queries
to accomplish the non-simple queries as they're pretty much unaffected
by increment gaps.

If that's total gibberish, perhaps a more complete example would allow
a clearer response.

Best
Erick

On Fri, Nov 28, 2008 at 9:45 AM, Amir Hossein Jadidinejad <
amir.jadidi@yahoo.com> wrote:

> Hi,
> I'm going to index some documents only with known phrases. Let me describe:
> Suppose that I have a controlled vocabulary of phrases (A list of some
> candidate phrases). I intend to
> index ONLY these phrases within my documents and have a retrieval model
> (for example simple VS-TF.IDF) that each index item is a predefined
> phrase.
> Could anyone tell me how to do it with Lucene?
> Greatly appreciate any comment or answer.
> Kind regards.
>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message