lucene-java-user mailing list archives

From Antony Bowesman
Subject Re: Positions in SpanFirst
Date Wed, 21 Feb 2007 20:30:01 GMT
Hi Erick,

> I'm not sure you can, since all the interfaces I use alter the increment
> between successive terms, but I'll be the first to admit that there are
> many nooks and crannies that I don't know about... But I suspect that a
> negative increment is not supported intentionally....

I read your other interesting post about omitting termvector info, and this led 
me to find Analyzer.getPositionIncrementGap.  The javadocs state:

"Invoked before indexing a Field instance if terms have already been added to 
that field..."

That sounded promising, but there does not seem to be a way to set it, and 
most of the Analyzers just seem to use the base Analyzer method, which returns 
0.  So I'm now confused as to what this actually does in practice.
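
To get my head around what the gap would do, here is a minimal sketch in plain 
Java.  It is only a toy model and not Lucene code: the class and method names 
are made up, and it assumes every term advances the position by 1 and that the 
gap is added once before each value after the first.  My understanding is that 
in Lucene itself the value is not set through a setter but supplied by 
overriding getPositionIncrementGap in an Analyzer subclass.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of token positions across several values of one field (not Lucene code). */
final class PositionGapSketch {

    /**
     * Assigns a position to every whitespace-separated term in each value.
     * Within a value each term advances the position by 1; before every
     * value after the first, the position additionally advances by gap.
     */
    static List<String> positions(String[] values, int gap) {
        List<String> out = new ArrayList<>();
        int next = 0;                       // position the next term will receive
        for (int v = 0; v < values.length; v++) {
            if (v > 0) {
                next += gap;                // assumed effect of the position increment gap
            }
            for (String term : values[v].trim().split("\\s+")) {
                out.add(term + "@" + next);
                next++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] recipients = {"first bit", "second part", "third section"};
        System.out.println(positions(recipients, 0));    // values run straight on
        System.out.println(positions(recipients, 100));  // values are pushed apart
    }
}
```

With a gap of 0 the values simply run on from each other (second lands at 
position 2); with a gap of 100 the second value starts at 102 and the third at 
204, so a phrase or proximity match should not be able to straddle two 
recipients unless the slop is at least as large as the gap.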

> But I really doubt you want to do this due to the consequences. Consider in
> your example the terms would have the following offsets
> first 0
> bit 1
> second 0
> part 1
> third 0
> section 1
> Now think about a proximity query "first section"~1. This would produce a
> hit because you've changed the whole sense of what offsets mean. Is this
> really a good thing?
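
The quoted example can be made concrete with a small sketch.  Again this is a 
toy model in plain Java rather than Lucene code: the names are hypothetical, 
and the slop test is a simplified two-term version (how far the second term 
would have to move to sit directly after the first).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a two-term proximity check over term positions (not Lucene code). */
final class ProximitySketch {

    /** Records term -> positions; restarts at 0 for each value when resetPerValue is true. */
    static Map<String, List<Integer>> index(String[] values, boolean resetPerValue) {
        Map<String, List<Integer>> postings = new HashMap<>();
        int next = 0;
        for (String value : values) {
            if (resetPerValue) {
                next = 0;                   // every value starts again at position 0
            }
            for (String term : value.trim().split("\\s+")) {
                postings.computeIfAbsent(term, k -> new ArrayList<>()).add(next);
                next++;
            }
        }
        return postings;
    }

    /** Simplified slop test: distance the second term must move to directly follow the first. */
    static boolean near(Map<String, List<Integer>> postings, String first, String second, int slop) {
        List<Integer> none = Collections.emptyList();
        for (int p1 : postings.getOrDefault(first, none)) {
            for (int p2 : postings.getOrDefault(second, none)) {
                if (Math.abs(p2 - p1 - 1) <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] values = {"first bit", "second part", "third section"};
        System.out.println(near(index(values, true), "first", "section", 1));   // positions reset per value
        System.out.println(near(index(values, false), "first", "section", 1));  // positions run on
    }
}
```

When positions restart at 0 for each value, "first" sits at 0 and "section" at 
1, so the check reports a hit even though the two terms come from different 
values.  When positions run on, "section" is at 5 and the check fails.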

That's a good point.  The field is used to index mail recipients and currently 
has a "starts with" search (non-Lucene implementation).  Unless I can set the 
position increment gap, it is only ever possible to search for the first indexed 
recipient with proximity queries.

I'm trying to ensure the Lucene implementation provides at least the original 
functionality.  If I can't achieve it, I can just document the limitation.  If I 
can, I may get false hits, but I still have the choice to filter the hits and 
weed out the false ones before they are given to the client.  It's not a 
showstopper, but it would be good if it could be done.
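
The filtering step could look roughly like the sketch below.  It is plain Java 
with hypothetical names, and it assumes each hit comes back with its stored 
recipient strings and that "starts with" should ignore case; neither 
assumption comes from the existing implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Toy post-filter that keeps only hits where some recipient really starts with the prefix. */
final class StartsWithFilterSketch {

    /** True if at least one recipient begins with the prefix (case-insensitive by assumption). */
    static boolean anyStartsWith(List<String> recipients, String prefix) {
        String wanted = prefix.toLowerCase(Locale.ROOT);
        for (String recipient : recipients) {
            if (recipient.toLowerCase(Locale.ROOT).startsWith(wanted)) {
                return true;
            }
        }
        return false;
    }

    /** Drops candidate hits (each a list of stored recipients) that fail the check. */
    static List<List<String>> keepTrueHits(List<List<String>> hits, String prefix) {
        List<List<String>> kept = new ArrayList<>();
        for (List<String> recipients : hits) {
            if (anyStartsWith(recipients, prefix)) {
                kept.add(recipients);
            }
        }
        return kept;
    }
}
```

The cost is one extra pass over the candidate hits, which seems acceptable as 
long as the query itself keeps the number of false hits small.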

