lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: Analyzer.getPositionIncrementGap question
Date Thu, 26 Oct 2006 15:58:37 GMT
See the SynonymFilter example in Lucene in Action (LIA) for how to create your
very own analyzer that gives you total control over the increment between
terms. Essentially, it lets you set the position increment for each and every
token. I suspect that this would be easier, but what do I know?

The difference is that you can set the position increment for each token, so
there's no reason to add the field multiple times. That is, you can make the
single call

d.add(new Field("fld", "value1 value2 value3"));

and have the opportunity to set the position increment for all three terms.
Of course, you could make three calls to d.add as well and this technique
should still work.
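
To make the arithmetic concrete, here is a rough sketch in plain Java. It is not real Lucene code: Tok, withIncrements and positions are hypothetical names made up for illustration, and the sketch assumes the first token sits at position 0 and every later token sits at the previous position plus its own increment. In an actual TokenFilter, the equivalent step would be calling Token.setPositionIncrement on each token before returning it.

```java
import java.util.ArrayList;
import java.util.List;

public class PositionIncrementSketch {

    /** Hypothetical stand-in for a token: a term plus its position increment. */
    public static final class Tok {
        public final String term;
        public final int positionIncrement;

        public Tok(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }
    }

    /**
     * Splits the text on whitespace. The first token keeps the default
     * increment of 1; token i (i >= 1) gets increments[i - 1].
     */
    public static List<Tok> withIncrements(String text, int[] increments) {
        String[] terms = text.trim().split("\\s+");
        List<Tok> out = new ArrayList<Tok>();
        for (int i = 0; i < terms.length; i++) {
            int inc = (i == 0) ? 1 : increments[i - 1];
            out.add(new Tok(terms[i], inc));
        }
        return out;
    }

    /** Turns increments into absolute positions, with the first token at 0. */
    public static int[] positions(List<Tok> toks) {
        int[] pos = new int[toks.size()];
        int current = -1; // so that a first increment of 1 lands on position 0
        for (int i = 0; i < toks.size(); i++) {
            current += toks.get(i).positionIncrement;
            pos[i] = current;
        }
        return pos;
    }

    public static void main(String[] args) {
        // Increment of 10 before value2 and 100 before value3, as in the question.
        List<Tok> toks = withIncrements("value1 value2 value3", new int[] {10, 100});
        int[] pos = positions(toks);
        for (int i = 0; i < pos.length; i++) {
            System.out.println(toks.get(i).term + " @ " + pos[i]);
        }
    }
}
```

Under those assumptions, value1, value2 and value3 end up at positions 0, 10 and 110, all from a single field value.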

But if you really need the gap instead, you can simply override the
getPositionIncrementGap method of your analyzer. It is only passed the field
name, though, so I don't know how you'd get at the current token when it's
called.
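
Here is a small plain-Java model of why the gap approach is awkward for this case. Again, it is not Lucene code: GapSource is a hypothetical interface that only mirrors the shape of getPositionIncrementGap(String fieldName). The positioning rule is my assumption of how the indexer behaves: each value is a single token, and the first token of every value after the first is pushed out by the gap on top of its normal increment of 1.

```java
public class FieldGapSketch {

    /** Hypothetical interface with the same shape as the analyzer method. */
    public interface GapSource {
        int getPositionIncrementGap(String fieldName);
    }

    /**
     * Positions of successive single-token values added to one field.
     * The gap is looked up by field name only, once per value after the first.
     */
    public static int[] positions(String fieldName, String[] values, GapSource gaps) {
        int[] pos = new int[values.length];
        int next = 0; // position the next token would get if there were no gap
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                next += gaps.getPositionIncrementGap(fieldName);
            }
            pos[i] = next;
            next++; // normal increment of 1 for the following token
        }
        return pos;
    }

    public static void main(String[] args) {
        // The only input is the field name, so the gap cannot depend on the value.
        GapSource constantGap = new GapSource() {
            public int getPositionIncrementGap(String fieldName) {
                return "fld".equals(fieldName) ? 10 : 0;
            }
        };
        String[] values = {"value1", "value2", "value3"};
        int[] pos = positions("fld", values, constantGap);
        for (int i = 0; i < pos.length; i++) {
            System.out.println(values[i] + " @ " + pos[i]);
        }
    }
}
```

Since the lookup sees nothing but the field name, every pair of values ends up separated by the same distance, unless the implementation keeps some state of its own between calls.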

One warning if you really *do* need getPositionIncrementGap: there's a bug
in 2.0 where getPositionIncrementGap isn't called properly if you use a
PerFieldAnalyzerWrapper. There's a fix in the source and the nightly build.
This doesn't (as far as I know) apply to the SynonymFilter strategy.

Hope this helps
Erick

On 10/25/06, qaz zaq <fortques@yahoo.com> wrote:
>
> I have multiple values that I want to add to the same FIELD, and I also
> want a non-zero but NON-CONSTANT position increment gap between those
> values, e.g., the gap between "value1" and "value2" is 10, but the gap
> between "value2" and "value3" is 100. Is there any way I can achieve that?
>
>   d.add(new Field("fld", "value1"));
>   d.add(new Field("fld", "value2"));
>   d.add(new Field("fld", "value3"));
>   indexWriter.addDocument(d);
>
