lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache
Date Sun, 21 Jun 2009 06:28:07 GMT


Uwe Schindler commented on LUCENE-1701:

I'll quote myself, and then attempt to not repeat myself further after this point (the back
and forth is silly).
bq. The next step after adding NumericField seems to be "it's a bug if getDocument() doesn't
return a NumericField, so we must encode it in the index". If that's the case, I'm -1 on adding
NumericField in the first place.

The docs already state that this class is for indexing only:

* <p><b>Please note:</b> This class is only used during indexing. You can also create
* numeric stored fields with it, but when retrieving the stored field value
* from a {@link Document} instance after search, you will get a conventional
* {@link Fieldable} instance where the numeric values are returned as {@link String}s
* (according to <code>toString(value)</code> of the used data type).
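To illustrate what the note means for callers, here is a minimal sketch that does not use Lucene at all. It assumes the stored text for a long value is what Long.toString produces, as the Javadoc above suggests; the class and method names are hypothetical and only model the round trip.

```java
// Hypothetical stand-in for the stored-field round trip described above.
// Assumption: the stored text of a long is the output of Long.toString(value).
final class StoredNumericRoundTrip {

    // What would end up as the stored text of the field.
    static String toStoredString(long value) {
        return Long.toString(value);
    }

    // What the caller has to do after getting the stored field back as a String.
    static long fromStoredString(String stored) {
        return Long.parseLong(stored);
    }

    public static void main(String[] args) {
        String stored = toStoredString(1245565687000L);
        long restored = fromStoredString(stored);
        System.out.println(stored + " -> " + restored);
    }
}
```

The point is only that the retrieved document hands back text, and the application is responsible for parsing it into the numeric type it expects.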

In my opinion, storing this info in the segments is not doable without pitfalls. If somebody
indexes a field name as a normal field in one IndexWriter session and then starts to index it
using NumericField in the next session, he would have two segments with different encodings
and two different "flags". When these two segments are merged later, what should happen to
the flag?
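The merge problem can be sketched without any Lucene code. Everything below is hypothetical: Lucene has no such per-segment encoding flag, and the names SegmentFlagMergeSketch and Encoding exist only for this illustration. The sketch shows that a merge has no sensible answer when the same field name carries different flags in two segments, so all it can do is fail.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of per-segment "flags" (field name -> encoding).
final class SegmentFlagMergeSketch {

    enum Encoding { PLAIN_TEXT, NUMERIC_TRIE }

    // Merges the flags of two segments; throws if one field has two encodings.
    static Map<String, Encoding> merge(Map<String, Encoding> a, Map<String, Encoding> b) {
        Map<String, Encoding> merged = new HashMap<>(a);
        for (Map.Entry<String, Encoding> e : b.entrySet()) {
            Encoding previous = merged.putIfAbsent(e.getKey(), e.getValue());
            if (previous != null && previous != e.getValue()) {
                throw new IllegalStateException("Field '" + e.getKey() + "' is "
                        + previous + " in one segment and " + e.getValue() + " in the other");
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Encoding> session1 = new HashMap<>();
        session1.put("price", Encoding.PLAIN_TEXT);    // first IndexWriter session
        Map<String, Encoding> session2 = new HashMap<>();
        session2.put("price", Encoding.NUMERIC_TRIE);  // second session, same field name
        try {
            merge(session1, session2);
        } catch (IllegalStateException ex) {
            System.out.println("Merge conflict: " + ex.getMessage());
        }
    }
}
```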

If we want to have such schemas, they must be index-wide, and Lucene has no mechanism for
that at the moment.

If somebody creates a schema that can do this (by storing the schema in a separate file next
to the segments file), we can think about it again, with all its problems, such as: what about
a MultiReader on top of two indexes with different schemas? Forbid it because the schemas
differ? All this tells me we should not do this; enforcing it is the task of Solr, my own
project panFMP, or Earwin's own schema.
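As a rough idea of what such enforcement could look like on the application side, here is a minimal sketch. It is not how Solr, panFMP, or any other project implements its schema; IndexWideSchemaSketch, FieldType, check, and compatibleWith are hypothetical names chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-level schema, kept outside the index itself.
final class IndexWideSchemaSketch {

    enum FieldType { TEXT, NUMERIC }

    private final Map<String, FieldType> types = new HashMap<>();

    // Records the type on first use; rejects a later use with a different type.
    void check(String field, FieldType type) {
        FieldType declared = types.putIfAbsent(field, type);
        if (declared != null && declared != type) {
            throw new IllegalArgumentException("Field '" + field
                    + "' was declared as " + declared + ", not " + type);
        }
    }

    // True if no field is declared with different types in the two schemas,
    // e.g. before opening one reader on top of two indexes.
    boolean compatibleWith(IndexWideSchemaSketch other) {
        for (Map.Entry<String, FieldType> e : types.entrySet()) {
            FieldType theirs = other.types.get(e.getKey());
            if (theirs != null && theirs != e.getValue()) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        IndexWideSchemaSketch indexA = new IndexWideSchemaSketch();
        indexA.check("price", FieldType.NUMERIC);
        IndexWideSchemaSketch indexB = new IndexWideSchemaSketch();
        indexB.check("price", FieldType.TEXT);
        System.out.println("Compatible: " + indexA.compatibleWith(indexB));
    }
}
```

The application would call check before adding each field to a document, and compatibleWith before combining two indexes, so the decision to forbid a mismatch stays with the layer that owns the schema.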

> Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache
> ----------------------------------------------------------------------------------------------------------------------------
>                 Key: LUCENE-1701
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index, Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>         Attachments: LUCENE-1701-test-tag-special.patch, LUCENE-1701.patch, LUCENE-1701.patch,
> In discussions about LUCENE-1673, Mike and I wanted to add a new NumericField to o.a.l.document
specifically for easy indexing. An alternative would be to add a NumericUtils.newXxxField() factory
that creates a preconfigured Field instance with norms and tf off, optionally a stored text
(LUCENE-1699), and the TokenStream already initialized. On the other hand, NumericUtils.newXxxSortField
could be moved to NumericSortField.
> Yonik and I tend to use the factory for both; Mike tends to create the new classes.
> Also, the parsers for string-formatted numerics are not public in FieldCache. As the new
SortField API (LUCENE-1478) makes it possible to support a parser in SortField instantiation,
it would be good to have the static parsers in FieldCache publicly available. SortField would
init its member variable to them (instead of NULL), making the code a lot simpler (FieldComparator
has these ugly null checks when retrieving values from the cache).
> Moving the Trie parsers into FieldCache as static instances would also make the code
cleaner, and we would be able to hide the "hack" StopFillCacheException by making it private
to FieldCache (currently it's public because NumericUtils is in o.a.l.util).

