lucene-dev mailing list archives

From "Earwin Burrfoot (JIRA)" <>
Subject [jira] Commented: (LUCENE-1701) Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache, move trie parsers to FieldCache
Date Fri, 19 Jun 2009 22:20:07 GMT


Earwin Burrfoot commented on LUCENE-1701:

bq. Someday maybe I'll convince you to donate this "schema" layer on top of Lucene
It's not generic enough to be of use for every user of Lucene, and it doesn't aim to be such.
It also evolves, and donating something to Lucene means casting it in concrete.
So that's not me being greedy or lazy (okay, maybe a little bit of the latter), it's simply
not public-quality (as I understand it) code.
I can share the design if anybody's interested, but everyone's coping with it themselves it

Solr has its own schema approach, and it has its merits and drawbacks compared to mine. That's
what is nice: we're able to use the same library in differing ways, and it doesn't force its
sense of 'best practices' on us.

bq. But I hope there are SOME named classes in there and not all static factory methods returning
anonymous untyped impls.
SOME of them aren't static :-D

bq. We shouldn't weaken trie's integration to core just because others have private implementations.
You shouldn't integrate into core something that is not core functionality. Think microkernels.
It's strange seeing you drive CSFs, custom indexing chains, and pluggability everywhere on one
side, while on the other side trying to add some weird custom properties to the index that are
tightly interwoven with only one of the possible numeric implementations.

bq. Design for today.
And spend two years deprecating and supporting today's designs after you get a better thing
tomorrow. Lucene-style back-compat and agile design don't marry well.

bq. What's important is that we don't weaken those private implementations with trie's addition,
and I don't think our approach here has done that.
You're weakening Lucene itself by introducing too much coupling between its components.

The IndexReader/Writer pair is a good example of what I'm arguing against. A dusty closet of microfeatures
that are tightly interwoven into a complex, hard-to-maintain mess with zillions of (possibly
broken) control paths - remember mutable deletes/norms+clone/reopen permutations? It could
have been avoided if IR/W were kept to the bare minimum (which most people are going to use), and
more advanced features were built on top of it, not in the same place.
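
A minimal sketch of the layering argued for above: a bare-minimum reader, with an advanced feature (here, caching) added by a wrapper on top instead of inside the core class. All names (`MinimalReader`, `ArrayReader`, `CachingReader`) are hypothetical and are not Lucene types; this only illustrates the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names for illustration only; none of these are Lucene types.
interface MinimalReader {
    String document(int docId);
}

// The bare-minimum core: read-only access and nothing else.
final class ArrayReader implements MinimalReader {
    private final String[] docs;

    ArrayReader(String... docs) {
        this.docs = docs.clone();
    }

    @Override
    public String document(int docId) {
        return docs[docId];
    }
}

// An advanced feature (caching) layered on top of the core, not built into it.
final class CachingReader implements MinimalReader {
    private final MinimalReader delegate;
    private final Map<Integer, String> cache = new HashMap<>();
    private int misses;

    CachingReader(MinimalReader delegate) {
        this.delegate = delegate;
    }

    @Override
    public String document(int docId) {
        String cached = cache.get(docId);
        if (cached == null) {
            // First access: ask the wrapped reader and remember the answer.
            misses++;
            cached = delegate.document(docId);
            cache.put(docId, cached);
        }
        return cached;
    }

    // Number of lookups that had to go to the wrapped reader.
    int misses() {
        return misses;
    }
}
```

The core class stays small and easy to reason about, and each extra feature has its own control paths in its own wrapper.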

NRT seems to tread the same path, and I'm not sure it's going to win that much turnaround
time after the newly introduced per-segment collection. Some time ago I finished a first version
of IR plugins, and I enjoy pretty low reopen times (field/facet/filter cache warmups included).
(Yes, I'm going to open an issue for plugins once they stabilize enough.)

> If we add some generic storable flags for Lucene fields, this is cool (probably), NumericField
can then capitalize on it, as well as users writing their own NNNFields.
+1 Wanna make a patch?

No, I'd like to continue the IR cleanup and play with a positionIncrement companion value that could
enable true multiword synonyms.
I know, I know, it's do-a-cracy. But it's not an excuse for hacks.

> Add NumericField and NumericSortField, make plain text numeric parsers public in FieldCache,
move trie parsers to FieldCache
> ----------------------------------------------------------------------------------------------------------------------------
>                 Key: LUCENE-1701
>                 URL:
>             Project: Lucene - Java
>          Issue Type: New Feature
>          Components: Index, Search
>    Affects Versions: 2.9
>            Reporter: Uwe Schindler
>            Assignee: Uwe Schindler
>             Fix For: 2.9
>         Attachments:
> In discussions about LUCENE-1673, Mike and I wanted to add a new NumericField to o.a.l.document
specifically for easy indexing. An alternative would be to add a NumericUtils.newXxxField() factory
that creates a preconfigured Field instance with norms and tf off, optionally a stored text
(LUCENE-1699), and the TokenStream already initialized. On the other hand, NumericUtils.newXxxSortField
could be moved to NumericSortField.
> Yonik and I tend to prefer the factory for both; Mike tends to prefer creating the new classes.
> Also, the parsers for string-formatted numerics are not public in FieldCache. As the new
SortField API (LUCENE-1478) makes it possible to specify a parser in SortField instantiation,
it would be good to have the static parsers in FieldCache publicly available. SortField would
init its member variable to them (instead of null), making the code a lot simpler (FieldComparator
has these ugly null checks when retrieving values from the cache).
> Moving the Trie parsers into FieldCache as static instances as well would make the code
cleaner, and we would be able to hide the "hack" StopFillCacheException by making it private
to FieldCache (currently it's public because NumericUtils is in o.a.l.util).
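
A minimal sketch of the "default parser instead of null" idea from the description above. It assumes hypothetical stand-in types (`IntParser`, `Parsers.DEFAULT_INT_PARSER`, `SortSpec`) and does not reproduce the actual FieldCache or SortField code; it only shows why a publicly available default removes the null checks on the consuming side.

```java
import java.util.Objects;

// Hypothetical stand-ins for illustration only; these are not Lucene classes.
interface IntParser {
    int parseInt(String text);
}

final class Parsers {
    // A shared, publicly visible default parser for plain-text integers.
    static final IntParser DEFAULT_INT_PARSER = Integer::parseInt;

    private Parsers() {}
}

final class SortSpec {
    private final String field;
    private final IntParser parser;

    // No parser given: fall back to the public default instead of storing null.
    SortSpec(String field) {
        this(field, Parsers.DEFAULT_INT_PARSER);
    }

    SortSpec(String field, IntParser parser) {
        this.field = Objects.requireNonNull(field, "field");
        this.parser = Objects.requireNonNull(parser, "parser");
    }

    String field() {
        return field;
    }

    // Consumers can parse directly; the parser is never null here.
    int parse(String text) {
        return parser.parseInt(text);
    }
}
```

For example, `new SortSpec("price").parse("42")` goes through the default parser, while a caller with a custom encoding passes its own `IntParser` to the two-argument constructor.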
