lucene-dev mailing list archives

From "Shai Erera (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3153) Adding field w/ norms should fail if same field was added w/o norms already
Date Tue, 31 May 2011 06:01:50 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041448#comment-13041448 ]

Shai Erera commented on LUCENE-3153:
------------------------------------

The difference between the two is that on add/updateDocument we can fail fast, whereas a
failure upon commit happens too late.

So I'm not at all convinced now that we should fail on this. Really, apps shouldn't be fiddling
w/ norms; at least the apps I know of always index a field the same way. I don't know how
common it is for apps to flip the norms bit, and clearly they can only do it one way. So maybe
what we should be doing is:

* Consolidate norms info in .fnx -- that's a good idea regardless of the issue.
* Have javadocs sort out any confusion -- we don't fail add/updateDocument attempts, we just
follow the semantics described in the javadocs.
* Provide an API for apps to disable norms for a field, since that's practically the only
direction in which we want to allow the aforementioned change.
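
A minimal sketch of what those semantics might look like, as a toy model rather than Lucene code. The class FieldNormsRegistry and its methods are hypothetical names made up for illustration; the map stands in for whatever per-field norms info .fnx would hold. Adds never fail, the norms bit can only flip from enabled to disabled, and disableNorms plays the role of the explicit API mentioned in the last bullet.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (NOT Lucene code): a per-field norms bit that can only flip one way. */
class FieldNormsRegistry {
    // field name -> true while norms are still enabled for that field
    private final Map<String, Boolean> normsEnabled = new HashMap<>();

    /**
     * Records that a field instance was added. Never throws; returns whether
     * norms are in effect for the field after this add.
     */
    boolean onFieldAdded(String field, boolean withNorms) {
        // Logical AND: once any instance omits norms, the field stays disabled.
        return normsEnabled.merge(field, withNorms, Boolean::logicalAnd);
    }

    /** Hypothetical explicit API: the app decides it no longer needs norms. */
    void disableNorms(String field) {
        normsEnabled.put(field, false);
    }

    /** Returns false for fields that were never seen. */
    boolean hasNorms(String field) {
        return normsEnabled.getOrDefault(field, false);
    }

    public static void main(String[] args) {
        FieldNormsRegistry registry = new FieldNormsRegistry();
        System.out.println(registry.onFieldAdded("a", true));  // true
        System.out.println(registry.onFieldAdded("a", false)); // false
        // Adding w/ norms again does not fail, but norms stay disabled.
        System.out.println(registry.onFieldAdded("a", true));  // false
    }
}
```

In this model the app only learns the outcome from the return value (or from reading the docs), which is the "follow javadocs semantics" option rather than the fail-fast one.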

Hmm ... another scenario hit me as I wrote the above lines:
* App adds a field w/o norms.
* App deletes the document w/ the field
* App adds a field w/ norms -- now what? norms are marked disabled for that field, but the
only document that caused that is deleted.
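
To make that scenario concrete, here is a small simulation. It is a toy model under my own assumptions (a single field, a sticky "norms disabled" bit that deletes never revisit), and StickyNormsScenario is a hypothetical class, not anything in Lucene.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy single-field index (NOT Lucene code) with a sticky "norms disabled" bit. */
class StickyNormsScenario {
    // docId -> whether the field was added w/ norms in that document
    private final Map<Integer, Boolean> liveDocs = new HashMap<>();
    private boolean normsDisabled = false;
    private int nextDocId = 0;

    int addDoc(boolean withNorms) {
        if (!withNorms) {
            normsDisabled = true; // assumption: the bit is set once and never cleared
        }
        liveDocs.put(nextDocId, withNorms);
        return nextDocId++;
    }

    void deleteDoc(int docId) {
        liveDocs.remove(docId); // the sticky bit is not re-evaluated on delete
    }

    boolean isNormsDisabled() {
        return normsDisabled;
    }

    long liveDocsWithoutNorms() {
        return liveDocs.values().stream().filter(withNorms -> !withNorms).count();
    }

    public static void main(String[] args) {
        StickyNormsScenario index = new StickyNormsScenario();
        int doc = index.addDoc(false); // field added w/o norms
        index.deleteDoc(doc);          // that document is deleted
        index.addDoc(true);            // field added w/ norms -- now what?
        System.out.println(index.isNormsDisabled());      // true
        System.out.println(index.liveDocsWithoutNorms()); // 0
    }
}
```

The end state is the awkward one: norms are disabled for the field even though no live document was indexed w/o norms.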

commit() can be called in between and several documents can be added w/ and w/o norms -- point
is, this just gets complicated. This is another reason IMO to let apps manage norms and trust
that they don't fiddle w/ norms. The 'disableNorms' API may still be useful for an app
that does not fiddle w/ norms, but decides it does not need norms for a field anymore.

> Adding field w/ norms should fail if same field was added w/o norms already
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-3153
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3153
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Shai Erera
>             Fix For: 4.0
>
>
> A spinoff from LUCENE-3146. Consider the following two scenarios, according to how 4.0
> currently works:
> * Field "a" is added w/ norms. Sometime later field "a" is added to a document w/o norms
> -- norms are disabled for field "a", for all docs.
> * Field "a" is added w/o norms - norms are disabled for field "a". Sometime later field
> "a" is added to a document w/ norms -- app thinks norms were added, while in fact they are
> dropped.
> This is a bug and case #2 should fail on add/updateDocument - app should know norms were
> not added. While case #1 isn't great either, it's the only way an app can choose to disable
> norms for field "a", after instances of it already contain norms, so we should support that
> scenario.
> In order to detect that early, we should track norms info in .fnx, as Mike describes
> at LUCENE-3146. Since this changes the index format, we should also update the "file format"
> page after we do it.
> Not sure what the deal is w/ 3.x indexes that are read by 4.0 code. Initially they won't
> have a .fnx file, so no central norms information exists to detect the cases I've described
> above. Over time, as segments are merged, .fnx will include information from more and more
> segments, but there's always a chance a few segments will still contain the norms for field
> "a". I'm not very familiar w/ that part of the code, but I think that:
> * If .fnx says "no norms for field a", then we ignore any norms information that may or
> may not exist in segments.
> * If .fnx says "norms for field a", then we need to make up some norms values for (old)
> segments w/ no norms? We need to make up values during segment merge and search?
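
For comparison, the fail-fast behavior proposed in the issue description above could be sketched roughly as follows. This is only an illustration under assumptions: FailFastNormsCheck is a hypothetical class, the in-memory map stands in for the central norms info that .fnx would carry, and the exception type is my own choice.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy fail-fast check (NOT Lucene code): reject adding norms to a norms-less field. */
class FailFastNormsCheck {
    // field name -> true if norms are omitted; stand-in for central info in .fnx
    private final Map<String, Boolean> omitNorms = new HashMap<>();

    /** Would be called from add/updateDocument for each field instance. */
    void checkAndRecord(String field, boolean withNorms) {
        Boolean omitted = omitNorms.get(field);
        if (withNorms && Boolean.TRUE.equals(omitted)) {
            // Case #2: the app would think norms were added while they are dropped.
            throw new IllegalArgumentException(
                "field \"" + field + "\" was already added w/o norms");
        }
        // Case #1 is allowed: adding w/o norms disables norms for the field.
        omitNorms.put(field, !withNorms);
    }

    public static void main(String[] args) {
        FailFastNormsCheck check = new FailFastNormsCheck();
        check.checkAndRecord("a", true);  // ok: norms enabled
        check.checkAndRecord("a", false); // ok: case #1, norms now disabled
        try {
            check.checkAndRecord("a", true); // case #2
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check runs at add time, which is the point of doing it on add/updateDocument rather than on commit.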

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

