lucene-dev mailing list archives

From Yonik Seeley <>
Subject Re: Optimize facets when actually single valued?
Date Sun, 11 Nov 2012 15:09:57 GMT
On Sun, Nov 11, 2012 at 3:33 AM, Robert Muir <> wrote:
> I am guessing at times people are lazy about schema definition. But, I think
> with lucene 4 stats we can detect if a field is actually single valued...
> Something like terms.size == terms.doccount == terms.sumdocfreq. I have to
> think about it a bit, maybe it's even simpler than this? Anyway, this could
> be used instead of actual schema def to just build a fieldcache instead of
> uninverted field I think... Should be a simple opto but maybe potent...

Funny you should mention this now - I was thinking exactly the same
thing on the flight home from ApacheCon!

This "detect single-valued" idea has implications for things other
than faceting as well - as you say, people can be lazy about the
schema definition, and having things "just work" is a good thing.
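To make the quoted check concrete, here is a minimal sketch in plain Java. It does not use the Lucene API: FieldStats and its methods are hypothetical names, and the three counters only mimic the per-field statistics mentioned above (unique term count, number of documents with the field, sum of per-term document frequencies). Under those assumptions, the strict three-way equality appears to hold only when every value is also unique across documents, while docCount == sumDocFreq alone seems enough to say no document has more than one distinct term.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy stand-in for per-field index statistics. Hypothetical class, not the Lucene API.
class FieldStats {
    final long size;        // number of unique terms in the field
    final long docCount;    // number of documents with at least one term in the field
    final long sumDocFreq;  // sum over all terms of the number of documents containing that term

    FieldStats(long size, long docCount, long sumDocFreq) {
        this.size = size;
        this.docCount = docCount;
        this.sumDocFreq = sumDocFreq;
    }

    // Builds the statistics from documents, each given as its list of values for one field.
    static FieldStats fromDocs(List<List<String>> docs) {
        Map<String, Integer> docFreq = new HashMap<>();
        long docCount = 0;
        for (List<String> values : docs) {
            Set<String> distinct = new HashSet<>(values);
            if (distinct.isEmpty()) {
                continue; // a document without the field contributes nothing
            }
            docCount++;
            for (String term : distinct) {
                docFreq.merge(term, 1, Integer::sum);
            }
        }
        long sumDocFreq = 0;
        for (int df : docFreq.values()) {
            sumDocFreq += df;
        }
        return new FieldStats(docFreq.size(), docCount, sumDocFreq);
    }

    // The check as written in the quoted message: size == docCount == sumDocFreq.
    boolean strictCheck() {
        return size == docCount && docCount == sumDocFreq;
    }

    // Looser check (an assumption of this sketch): no document has more than one distinct term.
    boolean atMostOneTermPerDoc() {
        return docCount == sumDocFreq;
    }
}
```

In this toy model, a typical facet field where many documents share a value (say three documents with the values a, a, b) fails the strict check but passes the looser one, which may be the "even simpler" form hinted at in the quote.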

I've thought about a more flexible field that acts like a single
valued field when you use it like that, and a multi-valued field
otherwise.  There won't quite be back compat with responses though
(since multiValued fields with single values now look like
"foo":["single_value"] instead of "foo":"single_value".)  Perhaps we
could add something like multiValued=flexible (and switch
to that by default), while retaining back compat for
multiValued=true/false.  Either that or bump the "version" of the schema
or response.  This is actually pretty important if we ever want to do
more "schema-less" (i.e. type guessing based on input), since it
allows us to only guess type and not have to deal with figuring out
multiValued.  It could lower the number of dynamic field definitions
necessary and make choosing the correct one simpler.
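The response-shape difference is easy to see in a small sketch. This is plain Java, not Solr code: FlexibleFieldRenderer, Mode, and render are hypothetical names, and the FLEXIBLE behavior (bare value for one value, array for several) is only one possible reading of the multiValued=flexible idea above.

```java
import java.util.List;

// Hypothetical illustration of how a field could be rendered in a JSON response.
class FlexibleFieldRenderer {
    // TRUE and FALSE mirror multiValued=true/false; FLEXIBLE is the proposed, not existing, setting.
    enum Mode { TRUE, FALSE, FLEXIBLE }

    static String render(String name, List<String> values, Mode mode) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("no values for field " + name);
        }
        boolean asArray;
        switch (mode) {
            case TRUE:
                asArray = true; // always an array, even for a single value
                break;
            case FALSE:
                if (values.size() > 1) {
                    throw new IllegalArgumentException("multiple values for single-valued field " + name);
                }
                asArray = false;
                break;
            default:
                asArray = values.size() > 1; // FLEXIBLE: shape follows the data (assumption)
        }
        StringBuilder sb = new StringBuilder();
        sb.append(quote(name)).append(':');
        if (!asArray) {
            return sb.append(quote(values.get(0))).toString();
        }
        sb.append('[');
        for (int i = 0; i < values.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(quote(values.get(i)));
        }
        return sb.append(']').toString();
    }

    // Minimal JSON string quoting: escapes backslashes and double quotes only.
    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With a single value, Mode.TRUE yields "foo":["single_value"] while Mode.FLEXIBLE yields "foo":"single_value", which is exactly the back-compat gap described above: a client written against one shape would see the other.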

