lucene-dev mailing list archives

From "Doron Cohen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3153) Adding field w/ norms should fail if same field was added w/o norms already
Date Tue, 31 May 2011 07:03:47 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13041461#comment-13041461 ]

Doron Cohen commented on LUCENE-3153:
-------------------------------------

I was not clear enough.

I meant that if the consistency of a requested NORMS state is decided by relying only on
committed data, then add/update requests can be checked only on a best-effort basis, while
the check at commit is complete.

So, for this example:

* Index does not contain field F
* doc1 is added with F set to NO NORMS
* doc2 is added with F set to WITH NORMS

I was not sure whether it is possible to tell that F in doc2 is inconsistent, because the
check relies on committed data, and perhaps especially so with DWPT (DocumentsWriterPerThread).

At commit, it is definitely possible to check this.

Similarly, this scenario has the same problem:

* Index contains (committed) field F WITH NORMS
* doc1 is added with F set to NO NORMS
* doc2 is added with F set to WITH NORMS

Again, F in doc2, while consistent with F as committed in the index, is inconsistent with
previously added F in doc1.
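
The two scenarios can be replayed with a small stand-alone model. This is only a sketch under
stated assumptions, not Lucene code: the class NormsCheckSketch and its methods addBestEffort
and commit are hypothetical names. It assumes that the add-time check can see committed field
state only, and that the commit-time check can see every buffered request in order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of the discussion above; none of this is Lucene API.
class NormsCheckSketch {
    private static final class Request {
        final String field;
        final boolean withNorms;
        Request(String field, boolean withNorms) {
            this.field = field;
            this.withNorms = withNorms;
        }
    }

    // Committed state: field name -> true if norms are disabled for the field.
    private final Map<String, Boolean> committedOmitNorms = new HashMap<>();
    // Buffered (uncommitted) requests, in arrival order.
    private final List<Request> pending = new ArrayList<>();

    /** Add-time check that consults committed state only. Returns false if rejected. */
    boolean addBestEffort(String field, boolean withNorms) {
        if (withNorms && Boolean.TRUE.equals(committedOmitNorms.get(field))) {
            return false; // the committed index already has norms disabled for this field
        }
        pending.add(new Request(field, withNorms));
        return true;
    }

    /** Commit-time check: replays all buffered requests and returns the inconsistent fields. */
    List<String> commit() {
        List<String> inconsistent = new ArrayList<>();
        Map<String, Boolean> omit = new HashMap<>(committedOmitNorms);
        for (Request r : pending) {
            boolean omitted = Boolean.TRUE.equals(omit.get(r.field));
            if (r.withNorms && omitted) {
                // Norms requested after the field was added without norms.
                if (!inconsistent.contains(r.field)) {
                    inconsistent.add(r.field);
                }
            } else if (!r.withNorms) {
                omit.put(r.field, true); // norms are disabled from here on
            } else {
                omit.putIfAbsent(r.field, false);
            }
        }
        committedOmitNorms.clear();
        committedOmitNorms.putAll(omit);
        pending.clear();
        return inconsistent;
    }

    public static void main(String[] args) {
        // First scenario: the index does not contain F.
        NormsCheckSketch index = new NormsCheckSketch();
        System.out.println(index.addBestEffort("F", false)); // doc1, NO NORMS
        System.out.println(index.addBestEffort("F", true));  // doc2, WITH NORMS: not caught here
        System.out.println(index.commit());                  // the complete check runs here
    }
}
```

In this model both add calls are accepted, because the committed state knows nothing about F
yet, and the inconsistency surfaces only in the list returned by commit. The second scenario
behaves the same way if F is first added with norms and committed.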

In this situation, the exception for an inconsistency might have to be thrown late in some
scenarios (at commit), which is unacceptable IMO. At the least, such behavior should be
specifically requested by the application, e.g. by setting a STRICT_NORMS mode or something
like that in the IndexWriterConfig (iwcfg).

I am not convinced that going that far is justified.
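
For illustration only, an opt-in mode along those lines could look roughly like the sketch
below. The names NormsMode, STRICT_NORMS as an enum constant, and StrictNormsSketch are all
hypothetical; nothing like them is claimed to exist in IndexWriterConfig. The sketch assumes
that the complete check runs at commit, so in strict mode the exception arrives late, which
is exactly the cost described above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical opt-in switch; not part of any Lucene API.
enum NormsMode { LENIENT, STRICT_NORMS }

class StrictNormsSketch {
    private final NormsMode mode;
    // Committed state: fields for which norms are disabled.
    private final Set<String> omitNorms = new HashSet<>();
    // Buffered requests, kept as two parallel lists in arrival order.
    private final List<String> pendingFields = new ArrayList<>();
    private final List<Boolean> pendingWithNorms = new ArrayList<>();

    StrictNormsSketch(NormsMode mode) {
        this.mode = mode;
    }

    /** Buffers the request; no consistency check is attempted at add time. */
    void add(String field, boolean withNorms) {
        pendingFields.add(field);
        pendingWithNorms.add(withNorms);
    }

    /** Runs the complete check. Throws only if the application asked for STRICT_NORMS. */
    void commit() {
        Set<String> omit = new HashSet<>(omitNorms);
        for (int i = 0; i < pendingFields.size(); i++) {
            String field = pendingFields.get(i);
            if (!pendingWithNorms.get(i)) {
                omit.add(field);
            } else if (omit.contains(field) && mode == NormsMode.STRICT_NORMS) {
                // Committed state is left untouched; a real writer would need a rollback story.
                throw new IllegalStateException(
                    "field \"" + field + "\" was already added without norms");
            }
            // In LENIENT mode the requested norms are silently dropped.
        }
        omitNorms.addAll(omit);
        pendingFields.clear();
        pendingWithNorms.clear();
    }

    boolean normsDisabled(String field) {
        return omitNorms.contains(field);
    }
}
```

With NormsMode.LENIENT the same sequence of adds commits without complaint and norms for the
field end up disabled; with NormsMode.STRICT_NORMS the commit fails instead.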

> Adding field w/ norms should fail if same field was added w/o norms already
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-3153
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3153
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: core/index
>            Reporter: Shai Erera
>             Fix For: 4.0
>
>
> A spinoff from LUCENE-3146. Consider the following two scenarios, according to how 4.0
> currently works:
> * Field "a" is added w/ norms. Sometime later field "a" is added to a document w/o norms
>   -- norms are disabled for field "a", for all docs.
> * Field "a" is added w/o norms -- norms are disabled for field "a". Sometime later field
>   "a" is added to a document w/ norms -- app thinks norms were added, while in fact they
>   are dropped.
> This is a bug and case #2 should fail on add/updateDocument - app should know norms were
> not added. While case #1 isn't great either, it's the only way an app can choose to disable
> norms for field "a", after instances of it already contain norms, so we should support that
> scenario.
> In order to detect that early, we should track norms info in .fnx, as Mike describes
> at LUCENE-3146. Since this changes the index format, we should also update the "file format"
> page after we do it.
> Not sure what's the deal w/ 3.x indexes that are read by 4.0 code. Initially they won't
> have a .fnx file, so no central norms information exists to detect the cases I've described
> above. Over time, as segments are merged, .fnx will include information from more and more
> segments, but there's always a chance that a few segments will still contain the norms for
> field "a". I'm not very familiar w/ that part of the code, but I think that:
> * If .fnx says "no norms for field a", then we ignore any norms information that may or
>   may not exist in segments.
> * If .fnx says "norms for field a", then we need to make up some norms values for (old)
>   segments w/ no norms? We need to make up values during segment merge and search?
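
The two tentative rules in the quoted description can be written down as a small function.
This is a sketch of the idea as stated there, not of anything Lucene implements: the class
FnxNormsSketch, the method effectiveNorms and the fillValue parameter are hypothetical, and
the value used to fill in missing norms is left to the caller because the description does
not say what it should be.

```java
import java.util.Arrays;

// Hypothetical helper illustrating the two rules; not Lucene code.
class FnxNormsSketch {
    /**
     * @param globalHasNorms what the central (.fnx-style) information says about the field
     * @param segmentNorms   per-document norms stored in the segment, or null if none
     * @param maxDoc         number of documents in the segment
     * @param fillValue      assumed placeholder norm for documents that have none
     * @return the norms to use for this segment, or null if norms are disabled
     */
    static byte[] effectiveNorms(boolean globalHasNorms, byte[] segmentNorms,
                                 int maxDoc, byte fillValue) {
        if (!globalHasNorms) {
            return null; // rule 1: ignore whatever the segment may or may not store
        }
        if (segmentNorms != null) {
            return segmentNorms; // the segment already has norms for the field
        }
        // Rule 2: an (old) segment without norms gets made-up values.
        byte[] madeUp = new byte[maxDoc];
        Arrays.fill(madeUp, fillValue);
        return madeUp;
    }
}
```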

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

