lucene-solr-user mailing list archives

From David Hastings <hastings.recurs...@gmail.com>
Subject Re: solr 5.2->7.2, suggester failure
Date Tue, 03 Apr 2018 18:55:31 GMT
Ah, thank you.
Turns out it was an experiment, so I removed them anyway and it's all good
now.

Since I'm here in the configuration for the new 7.x instances, I was going to
ask a side question. A lot of my Java properties are old or have been
tweaked over time across a series of different machines, so at this point it's
a hodgepodge collection of settings and I'm not sure if there are any
glaring holes. If someone could let me know if there is something I
definitely need to address, that would be awesome. Some of these settings
date from Solr 1 all the way to now. This is running on machines with 142
GB RAM, collection indexes around 300 GB to 500 GB, on 2 TB SSDs:

-XX:+CMSParallelRemarkEnabled
-XX:+CMSScavengeBeforeRemark
-XX:+ParallelRefProcEnabled
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCDateStamps
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=50
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:ConcGCThreads=4
-XX:MaxTenuringThreshold=8
-XX:NewRatio=3
-XX:ParallelGCThreads=8
-XX:PretenureSizeThreshold=64m
-XX:SurvivorRatio=4
-XX:TargetSurvivorRatio=90
-Xloggc:/SSD2TB-01/solr-5.2.1/server/logs/solr_gc.log
-Xms50000m
-Xmx50000m
-Xss256k
-verbose:gc



On Tue, Apr 3, 2018 at 2:50 PM, Kevin Risden <krisden@apache.org> wrote:

> It looks like there were changes in Lucene 7.0 that limited the size of the
> automaton to prevent overflowing the stack.
>
> https://issues.apache.org/jira/browse/LUCENE-7914
>
> The commit being:
> https://github.com/apache/lucene-solr/commit/7dde798473d1a8640edafb41f28ad25d17f25a2d
>
> Kevin Risden
>
> On Tue, Apr 3, 2018 at 1:45 PM, David Hastings <hastings.recursive@gmail.com>
> wrote:
>
> > For data, it's primarily a lot of garbage, around 200k titles of varying
> > length. I'm actually looking through my application now to see if I even
> > still use it or if it was an early experiment. I am just finding it odd
> > that it's failing in 7 but does fine on 5.
> >
> > On Tue, Apr 3, 2018 at 2:41 PM, Erick Erickson <erickerickson@gmail.com>
> > wrote:
> >
> > > What kinds of things go into your title field? On first blush that's a
> > > bit odd for a multi-word title field since it treats the entire input
> > > as a single string. The code is trying to build a large FST to hold
> > > all of this data. Would AnalyzingInfixLookupFactory or similar make
> > > more sense?
> > >
> > > buildOnStartup and buildOnOptimize are other red flags. This means
> > > that every time you start up, the data for the title field is read
> > > from disk and the FST is built (or index if you use a different impl).
> > > On a large corpus this may take many minutes.
> > >
> > > Best,
> > > Erick
> > >
> > > On Tue, Apr 3, 2018 at 11:28 AM, David Hastings
> > > <hastings.recursive@gmail.com> wrote:
> > > > Hey all, I recently got a 7.2 instance up and running, and it seems to be
> > > > going well. However, I have run into this when creating one of my indexes,
> > > > and was wondering if anyone had a quick idea right off the top of their
> > > > head.
> > > >
> > > > solrconfig:
> > > >
> > > > <searchComponent name="suggest" class="solr.SuggestComponent">
> > > >   <lst name="suggester">
> > > >     <str name="name">fixspell</str>
> > > >     <str name="lookupImpl">FuzzyLookupFactory</str>
> > > >     <str name="suggestAnalyzerFieldType">string</str>
> > > >     <str name="dictionaryImpl">DocumentDictionaryFactory</str>
> > > >     <str name="field">title</str>
> > > >     <str name="buildOnStartup">true</str>
> > > >     <str name="buildOnOptimize">true</str>
> > > >   </lst>
> > > > </searchComponent>
> > > >
> > > >
> > > > received error:
> > > >
> > > >
> > > > ERROR true
> > > > SuggestComponent
> > > > Exception in building suggester index for: fixspell
> > > > java.lang.IllegalArgumentException: input automaton is too large: 1001
> > > > at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1298)
> > > > at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306)
> > > > at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306)
> > > >
> > > > .....
> > > >
> > > > at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306)
> > > > at org.apache.lucene.util.automaton.Operations.topoSortStatesRecurse(Operations.java:1306)
> > > > at org.apache.lucene.util.automaton.Operations.topoSortStates(Operations.java:1275)
> > > > at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.replaceSep(AnalyzingSuggester.java:292)
> > > > at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.toAutomaton(AnalyzingSuggester.java:854)
> > > > at org.apache.lucene.search.suggest.analyzing.AnalyzingSuggester.build(AnalyzingSuggester.java:430)
> > > > at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:190)
> > > > at org.apache.solr.spelling.suggest.SolrSuggester.build(SolrSuggester.java:181)
> > > > at org.apache.solr.handler.component.SuggestComponent$SuggesterListener.buildSuggesterIndex(SuggestComponent.java:529)
> > > > at org.apache.solr.handler.component.SuggestComponent$SuggesterListener.newSearcher(SuggestComponent.java:511)
> > > > at org.apache.solr.core.SolrCore.lambda$getSearcher$17(SolrCore.java:2275)
> > >
> >
>
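
For anyone landing on this thread with the same error: the sketch below illustrates the kind of guard described in the thread, where a recursive topological sort refuses to descend past a fixed depth instead of risking a stack overflow. It is not Lucene's code. The class name, the helper names, and the limit of 1000 are assumptions, the limit being inferred only from the "1001" in the error message. It also assumes that a title analyzed as a single token (the "string" field type in the config above) yields a chain-shaped automaton with roughly one state per character, so a very long title would cross the limit.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class DepthLimitedTopoSort {
    // Assumed limit, inferred from the "1001" in the reported error message.
    static final int MAX_RECURSION_LEVEL = 1000;

    // Topologically sorts the states reachable from state 0.
    // transitions.get(s) lists the states reachable from s in one step.
    static List<Integer> topoSort(List<List<Integer>> transitions) {
        boolean[] visited = new boolean[transitions.size()];
        List<Integer> postOrder = new ArrayList<>();
        visit(transitions, 0, visited, postOrder, 0);
        Collections.reverse(postOrder); // reverse post-order is a topological order
        return postOrder;
    }

    private static void visit(List<List<Integer>> transitions, int state,
                              boolean[] visited, List<Integer> postOrder, int level) {
        // Fail fast rather than recursing until the thread stack is exhausted.
        if (level > MAX_RECURSION_LEVEL) {
            throw new IllegalArgumentException("input automaton is too large: " + level);
        }
        visited[state] = true;
        for (int next : transitions.get(state)) {
            if (!visited[next]) {
                visit(transitions, next, visited, postOrder, level + 1);
            }
        }
        postOrder.add(state);
    }

    // Hypothetical helper: a linear automaton with one transition per input
    // character, i.e. length + 1 states chained 0 -> 1 -> ... -> length.
    static List<List<Integer>> chain(int length) {
        List<List<Integer>> transitions = new ArrayList<>();
        for (int i = 0; i <= length; i++) {
            List<Integer> out = new ArrayList<>();
            if (i < length) {
                out.add(i + 1);
            }
            transitions.add(out);
        }
        return transitions;
    }

    public static void main(String[] args) {
        // A 200-character "title" stays well under the limit.
        System.out.println(topoSort(chain(200)).size());
        // A 1500-character "title" trips the guard.
        try {
            topoSort(chain(1500));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions a short chain sorts normally, while a chain deeper than the limit is rejected with the same style of message as in the stack trace. That would be consistent with the suggester building fine on 5.x and failing on 7.x for the same data.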
