lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1590) Stored-only fields automatically enable norms and tf when added to document
Date Thu, 09 Apr 2009 09:08:12 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12697440#action_12697440 ]

Michael McCandless commented on LUCENE-1590:
--------------------------------------------

bq. Maybe merge with the already existing FieldInfos/FieldInfo methods.

And we should think about flexible indexing, ie, make FieldInfo extensible.  I think there
are two separate questions, here:

  * What API do we expose for the "schema" (FieldInfo/s)?

  * How to handle the fact that each segment has its own "schema" (hide it, by virtually merging
the way SegmentMerger would, or, expose it)?

bq. A new case for this would be good, after thinking a little bit about it, I may open one.
But in general it should be combined with the Document/Fields redesign.

I agree, a new issue.

bq. Yes, and it will not work. I think, we leave the patch as it is, maybe remove the omitTf
and omitNorms update for binary fields. Binary fields are "special".

Do you want to make a new patch (removing omit* update for binary fields)?

> Stored-only fields automatically enable norms and tf when added to document
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1590
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4, 2.4.1, 2.9
>            Reporter: Uwe Schindler
>            Assignee: Michael McCandless
>             Fix For: 2.9
>
>         Attachments: LUCENE-1590.patch, LUCENE-1590.patch, LUCENE-1590.patch
>
>
> While updating my internal components to the new TrieAPI, I noticed the following:
> I index a lot of numeric fields with trie encoding, omitting norms and term frequency.
> This works great. Luke shows that both are omitted.
> I sometimes also want to store the components of the field under the same field name,
> so I add the field to the document a second time, but stored only (the Field c'tor that
> takes a TokenStream cannot also store the field). As it is stored only, I thought I could
> leave out the explicit setting of omitNorms and omitTermFreqAndPositions.
> After adding the stored-only-without-omits field, Luke shows all fields with norms enabled.
> I am not sure whether the norms/tf were really added to the index, but Luke shows a value
> for the norms and FieldInfo has them enabled.
> In my opinion, this is not intuitive: o.a.l.document.Field should switch both omit*
> options on when storing fields only (and also disable other indexing-only options).
> Alternatively, the internal FieldInfo.update(boolean isIndexed, boolean storeTermVector,
> boolean storePositionWithTermVector, boolean storeOffsetWithTermVector, boolean omitNorms,
> boolean storePayloads, boolean omitTermFreqAndPositions) should only change the omit* and
> other options if the isIndexed parameter (not this.isIndexed) is also true, and otherwise
> leave them as they are.
> In principle, when adding a stored-only field, any indexing-specific options should not
> be changed in FieldInfo. If the field was indexed with norms before, norms should stay
> enabled (but this would be the default as it is).
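
The guard proposed in the description can be sketched roughly as follows. This is a minimal illustration, not Lucene's actual FieldInfo code: the class FieldInfoSketch is hypothetical, it models only three of the flags named above, and the merge rule for the omit* flags (an indexed instance that does not omit a feature switches that feature on) is an assumption made for the example.

```java
// Hypothetical, simplified model of per-field index flags.
// NOT Lucene's FieldInfo; it only illustrates the guard proposed in the report.
final class FieldInfoSketch {
    boolean isIndexed;
    boolean omitNorms;
    boolean omitTermFreqAndPositions;

    FieldInfoSketch(boolean isIndexed, boolean omitNorms, boolean omitTermFreqAndPositions) {
        this.isIndexed = isIndexed;
        this.omitNorms = omitNorms;
        this.omitTermFreqAndPositions = omitTermFreqAndPositions;
    }

    // Merge the flags of another field instance that shares this field's name.
    void update(boolean isIndexed, boolean omitNorms, boolean omitTermFreqAndPositions) {
        // Guard on the incoming parameter, not on this.isIndexed.
        if (isIndexed) {
            this.isIndexed = true;
            // Assumed merge rule (illustration only): an indexed instance that
            // does not omit a feature switches that feature on for the field.
            if (!omitNorms) {
                this.omitNorms = false;
            }
            if (!omitTermFreqAndPositions) {
                this.omitTermFreqAndPositions = false;
            }
        }
        // A stored-only instance (isIndexed == false) leaves every flag untouched.
    }

    public static void main(String[] args) {
        // Field first indexed with norms and tf omitted, as in the trie use case.
        FieldInfoSketch info = new FieldInfoSketch(true, true, true);
        // The same name added again, stored only, without setting any omit* option.
        info.update(false, false, false);
        // prints: true true
        System.out.println(info.omitNorms + " " + info.omitTermFreqAndPositions);
    }
}
```

With the guard in place, the stored-only instance no longer re-enables norms or term frequencies for a field that was indexed with both omitted.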

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

