lucene-dev mailing list archives

From Chris Hostetter
Subject Re: Global field semantics
Date Mon, 10 Jul 2006 19:31:44 GMT

: previously mentioned a very simple one:  validating fields in the query
: parser.  More interesting examples are:

This strikes me as something that can be done with an abstraction layer
above and separate from the physical index (this is in fact what Solr
does) without needing to add any hard constraints on the index itself
(other than those imposed by the abstraction layer).
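
To make the field-validation case concrete, here is a minimal sketch of
such a layer in plain Java. It is only an illustration under stated
assumptions: FieldSchema, declare, and unknownFields are hypothetical
names, not part of Lucene or Solr, and the index itself is never
consulted or constrained.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical application-level schema; nothing here is Lucene or Solr API.
final class FieldSchema {
    // Declared field name -> whether the application treats it as tokenized.
    private final Map<String, Boolean> declared = new TreeMap<>();

    // Registers a field with the schema and returns this for chaining.
    FieldSchema declare(String name, boolean tokenized) {
        declared.put(name, tokenized);
        return this;
    }

    boolean isDeclared(String name) {
        return declared.containsKey(name);
    }

    // Returns, in encounter order, the query field names the schema does not know.
    Set<String> unknownFields(Iterable<String> queryFields) {
        Set<String> unknown = new LinkedHashSet<>();
        for (String field : queryFields) {
            if (!isDeclared(field)) {
                unknown.add(field);
            }
        }
        return unknown;
    }
}
```

A query parser wrapper could call unknownFields on the field names it
sees and reject the query if the result is non-empty, while the
underlying index stays free to hold any field.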

:   1.  Multiple inheritance on the fields of documents that record the
: sources of each inherited value to support efficient incremental maintenance

I'm sorry, you completely lost me ... can you clarify what you mean?

:   2.  "Record-valued fields" that store facets with values (e.g., time
: and user information for who set that value).  These cannot easily be
: broken into multiple fields because the fields in question are multi-valued.
:   3.  "Join fields" that reference id's of objects stored in separate
: indices (supporting queries that reference the fields in the joined index)

Both of these cases sound like situations where what you really want is
more flexibility in the Fields/Terms that can be associated with a docId
-- in the case of your "Record-valued fields" you want what I can only
think of as "rich terms", hierarchical data that can be queried ... along
the lines of the "FlexibleIndexing" wiki page correct? ... this doesn't
seem like it would require more concrete Field rules, but I can
certainly see how an added level of abstraction might help.

: Managing these kinds of rich semantic features in query parsing and
: indexing is greatly facilitated by a global field model.  I've built
: this into my app, and then started thinking about benefits in Lucene
: generally from such a model.


So I guess we are on the same page that this kind of thing can be done at
the App level -- what benefits do you see in moving it down to the Lucene
index level?

(I imagine it making the most sense as a contrib-ish auxiliary API that
developers can use when they don't need the full flexibility the low-level
API allows ... but it sounds like you think there are functional benefits
to it being a first order concept in the Lucene API?)

: Yes.  Here is (an elaboration of) the "global model with exceptions"
: idea we reached:

If there can be exceptions then there can't be any hard constraints in the
data store, correct? ... so an implementation like this could be a higher
level API?
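
As a rough sketch of what "global model with exceptions" could look like
as a higher-level API, the plain Java below resolves a field's options
from a global default unless a per-document exception is supplied. All
names here (FieldOptions, GlobalFieldModel, resolve) are hypothetical and
exist only for illustration; nothing is enforced in the data store.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical per-field options; these names do not come from Lucene.
final class FieldOptions {
    final boolean stored;
    final boolean tokenized;

    FieldOptions(boolean stored, boolean tokenized) {
        this.stored = stored;
        this.tokenized = tokenized;
    }
}

// Application-level model: one default per field name, overridable per document.
final class GlobalFieldModel {
    private final Map<String, FieldOptions> defaults = new HashMap<>();

    void setDefault(String field, FieldOptions options) {
        defaults.put(field, options);
    }

    // A per-document exception wins when supplied; otherwise the global default applies.
    FieldOptions resolve(String field, Optional<FieldOptions> exception) {
        if (exception.isPresent()) {
            return exception.get();
        }
        FieldOptions options = defaults.get(field);
        if (options == null) {
            throw new IllegalArgumentException("no default declared for field: " + field);
        }
        return options;
    }
}
```

Because the exception path simply bypasses the default, the model can only
advise; the index underneath has to accept whatever options a document
ends up using.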

: >   docA.add(new Field(f, "bar", Store.YES, Index.UN_TOKENIZED));
: >   docA.add(new Field(f, "foo", Store.NO,  Index.TOKENIZED));
: >
: >   docB.add(new Field(f, "x y", Store.YES, Index.TOKENIZED));
: >   docB.add(new Field(f, "z",   Store.NO,  Index.UN_TOKENIZED));

: Hoss, do you have a use case requiring Store and Index variance like this?

Not to that extreme, but I have certainly encountered situations where
storing a single value while indexing multiple values was needed -- this
is something Solr's schema can't handle actually, and we had to work
around it by using two fields.  I've also seen situations where it would
make a lot of sense to not only do that with one doc, but to also index
a single value and store multiple values in a different doc.
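
The two-field workaround can be sketched roughly as below. This is a toy
model in plain Java, not Lucene's Document/Field API: ToyField, ToyDoc,
TwoFieldWorkaround, and the field names "title_display" and
"title_search" are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a field; illustrative only, not Lucene's Field class.
final class ToyField {
    final String name;
    final String value;
    final boolean stored;
    final boolean indexed;

    ToyField(String name, String value, boolean stored, boolean indexed) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
    }
}

// Toy stand-in for a document: an ordered list of fields.
final class ToyDoc {
    private final List<ToyField> fields = new ArrayList<>();

    ToyDoc add(ToyField field) {
        fields.add(field);
        return this;
    }

    // Values that would be returned with the document for the given field name.
    List<String> storedValues(String name) {
        List<String> out = new ArrayList<>();
        for (ToyField f : fields) {
            if (f.stored && f.name.equals(name)) out.add(f.value);
        }
        return out;
    }

    // Values that would be searchable for the given field name.
    List<String> indexedValues(String name) {
        List<String> out = new ArrayList<>();
        for (ToyField f : fields) {
            if (f.indexed && f.name.equals(name)) out.add(f.value);
        }
        return out;
    }
}

final class TwoFieldWorkaround {
    // One stored-only display field plus a separate indexed-only field with many values.
    static ToyDoc build(String display, List<String> searchValues) {
        ToyDoc doc = new ToyDoc();
        doc.add(new ToyField("title_display", display, true, false));
        for (String v : searchValues) {
            doc.add(new ToyField("title_search", v, false, true));
        }
        return doc;
    }
}
```

With per-value Store/Index settings allowed on a single field name, as in
the quoted docA/docB example, both roles could share one field instead of
being split across two.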

