lucene-dev mailing list archives

From "Michael McCandless (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache
Date Fri, 19 Jun 2009 19:50:07 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-1701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721961#action_12721961 ]

Michael McCandless commented on LUCENE-1701:
--------------------------------------------

bq. Static factories are cool (they allow switching implementations and instantiation logic
without changing the API) and are as easy to use (probably even easier with generics in Java5)
as constructors.

Coolness is in the eye of the beholder?

Yes, they are cool in that they give the *developer* (us) future
freedom (to change the actual class returned, re-use instances, use
singletons, etc.), but not cool (in my eyes) for consumability.

Static factory classes are a good fit when the impls really should
remain anonymous because there are trivial differences.  EG the 12
different impls that can be returned by TopFieldCollector.create are a
good example.

But NumericField vs Field, and SortField vs NumericSortField, are
different and should be seen as different by consumers of Lucene's API.
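
To make the trade-off concrete, here is a minimal sketch of the two styles. All class and method names below (SimpleField, SimpleNumericField, SimpleFields.newLongField) are hypothetical stand-ins invented for illustration; they are not Lucene's Field or NumericField and not the attached patch.

```java
public class FactoryVsNamedClass {

    /** Hypothetical stand-in for a generic field; NOT Lucene's Field. */
    public static class SimpleField {
        private final String name;
        private final String value;

        public SimpleField(String name, String value) {
            this.name = name;
            this.value = value;
        }

        public String name() { return name; }
        public String stringValue() { return value; }
    }

    /** Named-class style: the type itself tells the consumer the field is numeric. */
    public static final class SimpleNumericField extends SimpleField {
        private final long number;

        public SimpleNumericField(String name, long number) {
            super(name, Long.toString(number));
            this.number = number;
        }

        public long longValue() { return number; }
    }

    /** Factory style: the caller only ever sees the generic SimpleField type. */
    public static final class SimpleFields {
        private SimpleFields() {}

        public static SimpleField newLongField(String name, long number) {
            // The implementation is free to change what it returns later,
            // but the numeric nature is invisible in the declared type.
            return new SimpleNumericField(name, number);
        }
    }

    public static void main(String[] args) {
        SimpleField viaFactory = SimpleFields.newLongField("price", 42L);
        SimpleNumericField viaConstructor = new SimpleNumericField("price", 42L);

        // Only the named class exposes the typed accessor without a cast.
        System.out.println(viaFactory.stringValue());
        System.out.println(viaConstructor.longValue());
    }
}
```

The factory keeps future freedom for the developer; the named class gives the consumer a type to discover, document, and program against.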

bq. If we add some generic storable flags for Lucene fields, this is cool (probably), NumericField
can then capitalize on it, as well as users writing their own NNNFields.

+1  Wanna make a patch?

Then NumericField would just tap into this extensibility... and,
somehow, in our future improved search-time document() API, have the
ability to make a NumericField.

bq. Why on earth can't my own split-field (vs single-field as in current Lucene) trie-encoded
number enjoy the same benefits as NumericField from Lucene core?

Because.... we've decided that this is our core approach to numerics?

Seriously, I don't see that as unfair.  Trie works well.  We have
chosen it as our way (for now, until something better comes along) of
handling numerics.  Just like we've picked a certain format for the
terms dict and prx file.

Sure, we should make it easy (add extensibility) so external fields
could store stuff in the index, but that doesn't mean we should hold
back on Numeric* consumability until we get that extensibility.

bq. I do use factory methods for all my queries and filters, and it makes me feel warm and
fuzzy!  Under the hood some of them consult FieldInfo to instantiate custom-tailored query
variants, so I just use range(CREATION_TIME, from, to) and don't think if this field is trie-encoded
or raw.

Someday maybe I'll convince you to donate this "schema" layer on top
of Lucene ;) But I hope there are SOME named classes in there and not
all static factory methods returning anonymous untyped impls.
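
For what it's worth, a schema-driven factory and named classes can coexist. The sketch below is only a guess at the shape of such a layer: Encoding, TrieRange, RawRange, declare() and range() are all hypothetical names, and nothing here reflects the actual code being discussed or any Lucene API.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaRangeSketch {

    /** Hypothetical per-field metadata: how the numeric value was indexed. */
    public enum Encoding { TRIE, RAW }

    /** Common base so callers can stay generic when they want to. */
    public abstract static class RangeQuery {
        public final String field;
        public final long from;
        public final long to;

        protected RangeQuery(String field, long from, long to) {
            this.field = field;
            this.from = from;
            this.to = to;
        }
    }

    /** Named variant for trie-encoded fields. */
    public static final class TrieRange extends RangeQuery {
        public TrieRange(String field, long from, long to) { super(field, from, to); }
    }

    /** Named variant for raw (plain text) fields. */
    public static final class RawRange extends RangeQuery {
        public RawRange(String field, long from, long to) { super(field, from, to); }
    }

    private final Map<String, Encoding> schema = new HashMap<>();

    /** Registers how a field is encoded; returns this for chaining. */
    public SchemaRangeSketch declare(String field, Encoding encoding) {
        schema.put(field, encoding);
        return this;
    }

    /** Factory method: consults the schema, but returns a named, typed class. */
    public RangeQuery range(String field, long from, long to) {
        Encoding encoding = schema.get(field);
        if (encoding == null) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return encoding == Encoding.TRIE
                ? new TrieRange(field, from, to)
                : new RawRange(field, from, to);
    }
}
```

The caller still writes range("creationTime", from, to) without thinking about the encoding, yet the result is a named class that can be inspected or constructed directly.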

bq. "Simple things should be simple", okay. Complex things should be simple too, argh! 

Whoa, this is all simple stuff?  What should be complex about using
numeric fields in Lucene?  This whole issue is "simple things should
be simple".


> Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache,
move trie parsers to FieldCache
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1701
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1701
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index, Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>
>         Attachments: NumericField.java
>
>
> In discussions about LUCENE-1673, Mike and I wanted to add a new NumericField to o.a.l.document
specifically for easy indexing. An alternative would be to add a NumericUtils.newXxxField() factory
that creates a preconfigured Field instance with norms and tf off, optionally a stored text
(LUCENE-1699), and the TokenStream already initialized. On the other hand, NumericUtils.newXxxSortField
could be moved to NumericSortField.
> Yonik and I tend to prefer the factory for both; Mike tends to prefer creating the new classes.
> Also, the parsers for string-formatted numerics are not public in FieldCache. As the new
SortField API (LUCENE-1478) makes it possible to supply a parser at SortField instantiation,
it would be good to have the static parsers in FieldCache publicly available. SortField would
initialize its member variable to them (instead of null), which would make the code a lot simpler
(FieldComparator currently has ugly null checks when retrieving values from the cache).
> Moving the Trie parsers into FieldCache as static instances as well would make the code
cleaner, and we would be able to hide the "hack" StopFillCacheException by making it private
to FieldCache (currently it's public because NumericUtils is in o.a.l.util).
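
The "default parser instead of null" idea from the issue description can be sketched roughly as follows. This is an illustration under assumptions only: LongParser, DEFAULT_LONG_PARSER and SortSpec are hypothetical stand-ins defined here, not the real FieldCache or SortField code.

```java
import java.util.Objects;

public class DefaultParserSketch {

    /** Hypothetical stand-in for a parser that turns an indexed term into a long. */
    public interface LongParser {
        long parseLong(String text);
    }

    /** Publicly available default for plain text numerics, so callers never need null. */
    public static final LongParser DEFAULT_LONG_PARSER = Long::parseLong;

    /** Hypothetical stand-in for a sort specification that carries its parser. */
    public static final class SortSpec {
        private final String field;
        private final LongParser parser;

        /** No parser given: fall back to the public default rather than null. */
        public SortSpec(String field) {
            this(field, DEFAULT_LONG_PARSER);
        }

        public SortSpec(String field, LongParser parser) {
            this.field = Objects.requireNonNull(field, "field");
            this.parser = Objects.requireNonNull(parser, "parser");
        }

        public String field() { return field; }

        /** Comparator-side code can call the parser directly, with no null check. */
        public long valueOf(String rawTerm) {
            return parser.parseLong(rawTerm);
        }
    }
}
```

With the member always initialized, the code that reads values for comparison has a single path, whether the default parser or a custom one was supplied.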

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



