lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <j...@apache.org>
Subject [jira] Updated: (LUCENE-1590) Stored-only fields automatically enable norms and tf when added to document
Date Tue, 07 Apr 2009 23:23:12 GMT

     [ https://issues.apache.org/jira/browse/LUCENE-1590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe Schindler updated LUCENE-1590:
----------------------------------

    Attachment: LUCENE-1590.patch

Here it is. It is not fully tested, but it seems to work, at least for norms, and all Lucene
tests pass. The changes in Field could be left out; the important part is the FieldInfo changes.
When a FieldInfo is created for a field that is not indexed, all the indexing-only flags are
set to their defaults.

The update method, which merges the existing field info with a new one, only updates the indexing
flags if the added field is indexed. For every flag that merges with an OR operation (all
except omitNorms), the c'tor's default is false; for the flag that merges with AND (omitNorms),
the c'tor's default is true.
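
To illustrate the merge rule described above, here is a minimal sketch. It is not the code from
the attached patch: the class name FieldFlagsSketch and its reduced set of flags are hypothetical
stand-ins for FieldInfo, chosen only to show why the defaults are picked the way they are (true is
the neutral value for AND, false is the neutral value for OR).

```java
// Hypothetical, simplified stand-in for FieldInfo; NOT the code from the patch.
final class FieldFlagsSketch {
    boolean isIndexed;                 // merges with OR: once indexed, always indexed
    boolean omitNorms;                 // merges with AND: once norms are kept, always kept
    boolean storePayloads;             // merges with OR
    boolean omitTermFreqAndPositions;  // merges with OR

    FieldFlagsSketch(boolean isIndexed, boolean omitNorms,
                     boolean storePayloads, boolean omitTermFreqAndPositions) {
        this.isIndexed = isIndexed;
        if (isIndexed) {
            this.omitNorms = omitNorms;
            this.storePayloads = storePayloads;
            this.omitTermFreqAndPositions = omitTermFreqAndPositions;
        } else {
            // Not indexed: ignore the passed flags and use the neutral value
            // of each merge operation, so a later indexed field decides alone.
            this.omitNorms = true;                  // neutral for AND
            this.storePayloads = false;             // neutral for OR
            this.omitTermFreqAndPositions = false;  // neutral for OR
        }
    }

    void update(boolean isIndexed, boolean omitNorms,
                boolean storePayloads, boolean omitTermFreqAndPositions) {
        this.isIndexed |= isIndexed;
        if (isIndexed) {
            // Only an indexed field may touch the indexing-only flags.
            this.omitNorms &= omitNorms;
            this.storePayloads |= storePayloads;
            this.omitTermFreqAndPositions |= omitTermFreqAndPositions;
        }
    }
}
```

With this rule the order of addition no longer matters in this sketch: a stored-only field added
before or after an indexed field with both omit flags set leaves omitNorms and
omitTermFreqAndPositions enabled.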

The problem is that Luke does not show the omitTf flag, as this flag does not seem to be loaded
by IndexReader, so I cannot find out whether omitTf was really applied in the index (Luke does
not show this flag even for fields that were added once with omitTf).

Finally, I want to add a test that does exactly what I have done and checks that it works.

> Stored-only fields automatically enable norms and tf when added to document
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1590
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1590
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Index
>    Affects Versions: 2.4, 2.4.1, 2.9
>            Reporter: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: LUCENE-1590.patch
>
>
> While updating my internal components to the new TrieAPI, I noticed the following:
> I index a lot of numeric fields with trie encoding, omitting norms and term frequency.
> This works great. Luke shows that both are omitted.
> I sometimes also want to have the components of the field stored, under the same field
> name. So I additionally add the field to the document again, but stored only (as the Field
> c'tor using a TokenStream cannot additionally store the field). As it is stored only, I
> thought that I could leave out the explicit setting of omitNorms and omitTermFreqAndPositions.
> After adding the stored-only-without-omits field, Luke shows all fields with norms enabled.
> I am not sure whether the norms/tf were really added to the index, but Luke shows a value
> for the norms and FieldInfo has it enabled.
> In my opinion, this is not intuitive: o.a.l.document.Field should switch both omit*
> options on when storing fields only (and also disable other indexing-only options).
> Alternatively, the internal FieldInfo.update(boolean isIndexed, boolean storeTermVector,
> boolean storePositionWithTermVector, boolean storeOffsetWithTermVector, boolean omitNorms,
> boolean storePayloads, boolean omitTermFreqAndPositions) should only change the omit* and
> other options if the isIndexed parameter (not this.isIndexed) is also true, and otherwise
> leave them as they are.
> In principle, when adding a stored-only field, any indexing-specific options should not
> be changed in FieldInfo. If the field was indexed with norms before, norms should stay
> enabled (but this would be the default as it is).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

