lucene-java-user mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: Problem with TermVector offsets and positions not being preserved
Date Fri, 27 Jul 2012 13:24:09 GMT
On Fri, Jul 27, 2012 at 9:10 AM, Andrzej Bialecki <ab@getopt.org> wrote:
>
> Catching up with this thread ... Luke 4.0-ALPHA makes a similar mistake. I
> fixed this in svn (to be released in a week or so) so that:
>
> * Luke now actually checks whether a doc has term vectors for a particular
> field and adjusts the field flags based on the presence/absence of a term
> vector. FieldInfos were not enough to handle some combinations.
>
> * Luke doesn't show the offsets/positions flags in the document view, since
> they are not known in advance. However, the pop-up that shows a term vector
> correctly shows positions and offsets if available (or blanks if not
> available).
>

Thanks Andrzej!

I can't remember in which issue we stopped writing those bits (maybe
https://issues.apache.org/jira/browse/LUCENE-3679 ?)... It wasn't
until this email that I remembered it.

But if I recall correctly, there might have been problems: I know there
was a lot of sneakiness to try to handle the corner cases so the bits
would be "correct", but nothing in Lucene really used these bits... and
I don't think CheckIndex ever actually validated that, if the offsets
bit was set in FieldInfos, at least 1 doc (even if deleted) actually
had them, and so on.
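
For illustration only, the kind of consistency check described above could be
sketched as follows. This is plain Java and does not use Lucene's API:
DocVector and offsetsBitIsConsistent are hypothetical names, and a real check
would have to read the term vectors from the index, which is not shown here.

```java
import java.util.List;

// Hypothetical model, not Lucene code: checks that a field-level "offsets"
// bit is backed by at least one document that really stored offsets.
final class OffsetsBitCheck {

    // Stand-in for one document's term vector on a single field.
    static final class DocVector {
        final boolean deleted;     // kept only to show that deleted docs still count
        final boolean hasOffsets;  // true if this doc's term vector stored offsets

        DocVector(boolean deleted, boolean hasOffsets) {
            this.deleted = deleted;
            this.hasOffsets = hasOffsets;
        }
    }

    // If the field-level bit is set, at least one doc (deleted or not) must
    // have offsets. If the bit is not set, this check has nothing to say.
    static boolean offsetsBitIsConsistent(boolean fieldOffsetsBit, List<DocVector> docs) {
        if (!fieldOffsetsBit) {
            return true;
        }
        for (DocVector doc : docs) {
            if (doc.hasOffsets) {
                return true; // deliberately ignores doc.deleted
            }
        }
        return false;
    }
}
```

With this model, a field whose bit is set but whose documents all lack offsets
is reported as inconsistent, while a single deleted document that stored
offsets is enough to satisfy the check.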

The worst part is that I don't actually understand the use case for
this being configurable on a per-document basis for a field. I actually
think this is confusing...

-- 
lucidimagination.com


