lucene-solr-user mailing list archives

From Alessandro Benedetti <benedetti.ale...@gmail.com>
Subject Re: docValues in solr/lucene 4.8.x
Date Fri, 05 Jun 2015 16:03:11 GMT
I would like to add this to Shawn's description:

> DocValues are only available for specific field types. The types chosen
> determine the underlying Lucene docValue type that will be used. The
> available Solr field types are:
>
>    - StrField and UUIDField.
>       - If the field is single-valued (i.e., multi-valued is false), Lucene
>         will use the SORTED type.
>       - If the field is multi-valued, Lucene will use the SORTED_SET type.
>    - Any Trie* numeric fields and EnumField.
>       - If the field is single-valued (i.e., multi-valued is false), Lucene
>         will use the NUMERIC type.
>       - If the field is multi-valued, Lucene will use the SORTED_SET type.
>
> These Lucene types are related to how the values are sorted and stored.
>
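
To make the mapping in the quoted list easier to check against a schema, here is a minimal Python sketch of it. The function name and the choice to raise an error for unlisted types are my own assumptions for illustration; this is not Solr code, only a restatement of the list above.

```python
def lucene_docvalues_type(field_class: str, multi_valued: bool) -> str:
    """Return the Lucene docValues type named in the quoted list.

    field_class is the Solr field type class name, e.g. "StrField" or
    "TrieLongField". Hypothetical helper, not part of Solr.
    """
    if field_class in ("StrField", "UUIDField"):
        # String-like fields: SORTED when single-valued, SORTED_SET otherwise.
        return "SORTED_SET" if multi_valued else "SORTED"
    if field_class.startswith("Trie") or field_class == "EnumField":
        # Trie* numerics and EnumField: NUMERIC when single-valued.
        return "SORTED_SET" if multi_valued else "NUMERIC"
    # Anything else is outside the quoted list.
    raise ValueError(f"{field_class} is not in the quoted docValues list")


print(lucene_docvalues_type("StrField", multi_valued=False))
print(lucene_docvalues_type("TrieLongField", multi_valued=True))
```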

Keep this in mind when designing your schema.

Furthermore, though I guess it's obvious, always evaluate the cardinality of
the field you want to facet on.
For a low-cardinality field it is not even necessary to build docValues,
because facet.method=enum (described below) already handles that case well.

> The facet.method parameter selects the type of algorithm or method Solr
> should use when faceting a field.
>
> Setting: enum
> Results: Enumerates all terms in a field, calculating the set intersection
> of documents that match the term with documents that match the query. This
> method is recommended for faceting multi-valued fields that have only a few
> distinct values. The average number of values per document does not matter.
> For example, faceting on a field with U.S. States such as Alabama,
> Alaska, ... Wyoming would lead to fifty cached filters which would be
> used over and over again. The filterCache should be large enough to hold
> all the cached filters.
>
> Setting: fc
> Results: Calculates facet counts by iterating over documents that match the
> query and summing the terms that appear in each document. This is currently
> implemented using an UnInvertedField cache if the field either is
> multi-valued or is tokenized (according to FieldType.isTokenized()). Each
> document is looked up in the cache to see what terms/values it contains,
> and a tally is incremented for each value. This method is excellent for
> situations where the number of indexed values for the field is high, but
> the number of values per document is low. For multi-valued fields, a hybrid
> approach is used that uses term filters from the filterCache for terms
> that match many documents. The letters fc stand for field cache.
>
> Setting: fcs
> Results: Per-segment field faceting for single-valued string fields. Enable
> with facet.method=fcs and control the number of threads used with the
> threads local parameter. This parameter allows faceting to be faster in the
> presence of rapid index changes.
>
> The default value is fc (except for fields using the BoolField field
> type) since it tends to use less memory and is faster when a field has many
> unique terms in the index.
>
> This parameter can be specified on a per-field basis with the syntax of
> f.<fieldname>.facet.method.
>
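
To show why cardinality matters, here is a toy Python sketch of the two counting strategies described in the quote. It is only an illustration under my own assumptions: the data is made up, the plain dict and sets stand in for the index and the filterCache, and nothing here reflects how Solr implements either method internally.

```python
from collections import Counter

# Made-up index: document id -> values of a multi-valued "state" field.
DOCS = {
    1: ["Alabama"],
    2: ["Alaska", "Alabama"],
    3: ["Wyoming"],
    4: ["Alaska"],
}


def facet_enum(docs, matching_ids):
    """enum-style: one set intersection per distinct term in the field."""
    # term -> set of doc ids; stands in for one cached filter per term.
    term_filters = {}
    for doc_id, values in docs.items():
        for value in values:
            term_filters.setdefault(value, set()).add(doc_id)
    counts = {}
    for term, ids in term_filters.items():
        hits = len(ids & matching_ids)
        if hits:
            counts[term] = hits
    return counts


def facet_fc(docs, matching_ids):
    """fc-style: walk the matching documents and tally each value seen."""
    counts = Counter()
    for doc_id in matching_ids:
        counts.update(docs[doc_id])
    return dict(counts)


query_hits = {1, 2, 4}
print(facet_enum(DOCS, query_hits))
print(facet_fc(DOCS, query_hits))
```

Both functions return the same counts. The difference is the cost: the enum sketch does work proportional to the number of distinct terms, while the fc sketch does work proportional to the number of matching documents and their values.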

All of this information comes from the official Solr wiki.
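
As a rough illustration of the per-field syntax f.<fieldname>.facet.method, the following Python sketch builds a query string that picks a facet method per field. The helper name and the field names "state" and "author" are hypothetical; only the parameter names (facet, facet.field, facet.method) come from the quoted documentation.

```python
from urllib.parse import urlencode

VALID_METHODS = {"enum", "fc", "fcs"}


def facet_query_string(query, methods_by_field):
    """Build Solr request parameters with a facet method chosen per field.

    methods_by_field maps a field name to "enum", "fc" or "fcs".
    Hypothetical helper for illustration only.
    """
    params = [("q", query), ("facet", "true")]
    for field, method in methods_by_field.items():
        if method not in VALID_METHODS:
            raise ValueError(f"unknown facet.method: {method}")
        params.append(("facet.field", field))
        # Per-field override: f.<fieldname>.facet.method
        params.append((f"f.{field}.facet.method", method))
    return urlencode(params)


# Low-cardinality field with enum, high-cardinality field with fc.
print(facet_query_string("*:*", {"state": "enum", "author": "fc"}))
```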

Cheers


2015-06-05 7:15 GMT+01:00 Shawn Heisey <apache@elyograg.org>:

> On 6/4/2015 11:42 PM, pras.venkatesh wrote:
> > I see docValues has been there since Lucene 4.0. so can I use docValues with
> > my current solr cloud version of 4.8.x
> >
> > The reason I am asking is because, I have deployment mechanism and securing
> > the index (using Tomcat valve) all built out based on Tomcat which I need
> > figure out all the way again with Jetty.
> >
> > so thinking if I could use docValues with solr/lucene 4.8.x in order to
> > perform sort/facet queries effectively(consuming less heap memory)
>
> Solr 4.8 can do docValues.  To enable the feature on a field, you just
> need to change field definition in schema.xml to include docValues="true".
>
> Note that you need to completely reindex.  After you make the change and
> restart or reload, sorting and facets will NOT work until the reindex is
> done, because when docValues is present in the schema, Solr will try to
> use docValues, and that data will not be present unless you reindex.
>
> https://wiki.apache.org/solr/HowToReindex
>
> Thanks,
> Shawn
>
>
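
To make Shawn's point about schema.xml concrete, here is a small Python sketch that lists which fields in a schema fragment have docValues="true" and would therefore need a full reindex before sorting and faceting work on them. The fragment, the field names, and the type names are invented for illustration and are not taken from any real schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml fragment; names and types are made up.
SCHEMA_FRAGMENT = """
<fields>
  <field name="id" type="string" indexed="true" stored="true"/>
  <field name="category" type="string" indexed="true" stored="true" docValues="true"/>
  <field name="price" type="tfloat" indexed="true" stored="true" docValues="true"/>
</fields>
"""


def docvalues_fields(schema_xml):
    """Return the names of fields declared with docValues="true"."""
    root = ET.fromstring(schema_xml)
    return [
        field.get("name")
        for field in root.iter("field")
        if field.get("docValues") == "true"
    ]


print(docvalues_fields(SCHEMA_FRAGMENT))
```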


-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
