lucene-solr-user mailing list archives

From Alessandro Benedetti <abenede...@apache.org>
Subject Re: realtime get requirements
Date Wed, 13 Jan 2016 09:46:43 GMT
Hi Matteo,
which Solr version are you using?
Prior to 5.1, the suggester was built by default on startup, causing long
startup times (https://issues.apache.org/jira/browse/SOLR-6845).

If you are on Solr >= 5.1, I strongly discourage using
buildOnStartup=true unless it is a specific requirement.
As Erick was saying:


>    - *The “buildOnStartup” parameter should be set to “false”*. Really.
>    This can lead to *very* long startup times, many minutes on very large
>    indexes. Do you really want to re-read, decompress and add the field
>    from *every* document to the suggester *every time you start Solr!* Likely
>    not, but you can if you want to.
>    - *The “buildOnCommit” parameter should be set to “false”*. Really. Do
>    you really want to re-read, decompress and add the field from
>    *every* document to the suggester *every time you commit!* Likely
>    not, but you can if you want to.
>
> In detail, when the *DocumentDictionary* is built, the following happens
> for *ALL the documents* in the index:
>
>    - *the stored content* of the configured field is read from disk
>    (*stored="true"* is required on the field for the Suggester to work)
>
>    - the compressed content is decompressed (remember that Solr stores
>    the plain content of a field applying a compression algorithm [3])
>
>    - the suggester data structure is built
>
> We must be really careful about this phrase:
> "for *ALL the documents*" -> *no delta dictionary building is happening*


So take extra care every time you decide to build the Suggester!
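
For reference, here is a rough sketch of a suggester definition in
solrconfig.xml with both flags turned off. The suggester name, the field
name, the field type and the choice of lookup implementation are
placeholders for illustration, not taken from Matteo's setup, so adjust
them to your own schema:

```xml
<!-- solrconfig.xml: hypothetical suggester definition; all names are placeholders -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>                      <!-- placeholder name -->
    <str name="lookupImpl">FuzzyLookupFactory</str>         <!-- one possible lookup implementation -->
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">title</str>                           <!-- placeholder field; must be stored="true" -->
    <str name="suggestAnalyzerFieldType">text_general</str> <!-- assumes this field type exists in the schema -->
    <str name="buildOnStartup">false</str>                  <!-- do not rebuild on every Solr start -->
    <str name="buildOnCommit">false</str>                   <!-- do not rebuild on every commit -->
  </lst>
</searchComponent>
```

With both flags off, the dictionary is built only when a request explicitly
asks for it with suggest.build=true, so the full pass over the index happens
when you choose to run it rather than on every restart or commit.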


Cheers

On 12 January 2016 at 18:18, Erick Erickson <erickerickson@gmail.com> wrote:

> right, suggester had some bad behavior where it rebuilt on startup despite
> setting the flag to _not_ do that. See:
>
> Some details here:
>
> https://lucidworks.com/blog/2015/03/04/solr-suggester/
>
> Best,
> Erick
>
> On Tue, Jan 12, 2016 at 8:12 AM, Matteo Grolla <matteo.grolla@gmail.com>
> wrote:
> > ok,
> >       suggester was responsible for the long time to load.
> > Thanks
> >
> > 2016-01-12 15:47 GMT+01:00 Matteo Grolla <matteo.grolla@gmail.com>:
> >
> >> Thanks Shawn,
> >>      On a production solr instance some cores take a long time to load
> >> while others of similar size take much less. One of the differences
> >> between these cores is the directoryFactory.
> >>
> >> 2016-01-12 15:34 GMT+01:00 Shawn Heisey <apache@elyograg.org>:
> >>
> >>> On 1/12/2016 2:50 AM, Matteo Grolla wrote:
> >>> > and that it works with any directory factory? (Not just
> >>> > NRTCachingDirectoryFactory)
> >>>
> >>> Realtime Get relies on the updateLog to return uncommitted documents,
> >>> and standard Lucene mechanisms to return documents that have already
> >>> been committed.  It should work with any directory.
> >>>
> >>> I would like to know why you're changing the directory.  The only time
> >>> the directory should be changed is if you want to work with something
> >>> exotic like HDFS.  With a typical installation using a typical
> >>> filesystem, NRTCachingDirectoryFactory is absolutely the best option
> >>> and should not be replaced with anything else.  The NRT factory uses
> >>> MMap, so there is no need to switch to MMapDirectoryFactory.
> >>>
> >>> Thanks,
> >>> Shawn
> >>>
> >>>
> >>
>



-- 
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England
