lucene-solr-user mailing list archives

From Lee Carroll <lee.a.carr...@googlemail.com>
Subject Re: Trying to understand SOLR memory requirements
Date Mon, 23 Jan 2012 13:23:50 GMT
On selection, issue another query to get your additional data (if I
follow what you want).
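Lee's two-step approach can be sketched client-side: keep the suggester term small, then fetch the full stored record when the user picks a suggestion. A minimal sketch, assuming a hypothetical `places` core and `place_id` field (neither name is confirmed in the thread):

```javascript
// Hypothetical sketch of the "query again on selection" approach:
// the suggester returns only the display term plus an id, and selection
// triggers a second request for the stored record. Core and field names
// here (places, place_id) are illustrative, not from the thread.
function buildSelectUrl(solrBase, placeId) {
  const params = new URLSearchParams({
    q: 'place_id:' + placeId, // look up the selected suggestion's record
    fl: 'place_id,timezone,countryid,regionid,cityid', // stored fields only
    wt: 'json',
  });
  return solrBase + '/select?' + params.toString();
}
```

The selection handler would then fetch this URL and read the single matching document from the response.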

On 22 January 2012 18:53, Dave <dlauer@gmail.com> wrote:
> I take it from the overwhelming silence on the list that what I've asked is
> not possible? It seems like the suggester component is not well supported
> or understood, and limited in functionality.
>
> Does anyone have any ideas for how I would implement the functionality I'm
> looking for? I'm trying to implement a single location auto-suggestion box
> that will search across multiple DB tables. It would take several possible
> inputs: city, state, country; state, country; or country. In addition, there
> are many aliases for each city, state, and country that map back to the
> original city/state/country. Once they select a suggestion, that suggestion
> needs to have certain information associated with it. It seems that the
> Suggester component is not the right tool for this. Anyone have other ideas?
>
> Thanks,
> Dave
>
> On Thu, Jan 19, 2012 at 6:09 PM, Dave <dlauer@gmail.com> wrote:
>
>> That was how I originally tried to implement it, but I could not figure
>> out how to get the suggester to return anything but the suggestion. How do
>> you do that?
>>
>>
>> On Thu, Jan 19, 2012 at 1:13 PM, Robert Muir <rcmuir@gmail.com> wrote:
>>
>>> I really don't think you should put a huge JSON document as a search term.
>>>
>>> Just make "Brooklyn, New York, United States" or whatever you intend
>>> the user to actually search on/type in as your search term.
>>> Put the rest in different fields (e.g. stored-only, not even indexed
>>> if you don't need that) and have Solr return it that way.
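Robert's stored-only-field suggestion would look roughly like this in schema.xml (field names are illustrative, not from the thread):

```xml
<!-- Sketch only: the searchable term is indexed; the payload is stored
     but not indexed, so it never inflates the term dictionary. -->
<field name="name"    type="text_general" indexed="true"  stored="true"/>
<field name="payload" type="string"       indexed="false" stored="true"/>
```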
>>>
>>> On Thu, Jan 19, 2012 at 12:31 PM, Dave <dlauer@gmail.com> wrote:
>>> > In my original post I included one of my terms:
>>> >
>>> > Brooklyn, New York, United States?{ |id|: |2620829|,
>>> > |timezone|:|America/New_York|,|type|: |3|, |country|: { |id| : |229| },
>>> > |region|: { |id| : |3608| }, |city|: { |id|: |2616971|, |plainname|:
>>> > |Brooklyn|, |name|: |Brooklyn, New York, United States| }, |hint|:
>>> > |2300664|, |label|: |Brooklyn, New York, United States|, |value|:
>>> > |Brooklyn, New York, United States|, |title|: |Brooklyn, New York,
>>> United
>>> > States| }
>>> >
>>> > I'm matching on the first part of the term (the part before the ?), and
>>> > then the rest is being passed via JSON into Javascript, then converted
>>> to a
>>> > JSON term itself. Here is my data-config.xml file, in case it sheds any
>>> > light:
>>> >
>>> > <dataConfig>
>>> >  <dataSource type="JdbcDataSource"
>>> >              driver="com.mysql.jdbc.Driver"
>>> >              url=""
>>> >              user=""
>>> >              password=""
>>> >              encoding="UTF-8"/>
>>> >  <document>
>>> >    <entity name="countries"
>>> >            pk="id"
>>> >            query="select p.id as placeid, c.id, c.plainname, c.name,
>>> > p.timezone from countries c, places p where p.regionid = 1 AND p.cityid
>>> = 1
>>> > AND c.id=p.countryid AND p.settingid=1"
>>> >            transformer="TemplateTransformer">
>>> >            <field column="id" name="countryid"/>
>>> >            <field column="plainname" name="countryname"/>
>>> >            <field column="name" name="fullcountryname"/>
>>> >            <field column="placeid" name="place_id"/>
>>> >            <field column="timezone" name="timezone"/>
>>> >            <field column="countryinfo"
>>> template="${countries.plainname}?{
>>> > |id|: |${countries.placeid}|, |timezone|:|${countries.timezone}|,|type|:
>>> > |1|, |country|: { |id| : |${countries.id}|, |plainname|:
>>> > |${countries.plainname}|, |name|: |${countries.plainname}| }, |region|:
>>> {
>>> > |id| : |0| }, |city|: { |id|: |0| }, |hint|: ||, |label|:
>>> > |${countries.plainname}|, |value|: |${countries.plainname}|, |title|:
>>> > |${countries.plainname}| }"/>
>>> >    </entity>
>>> >    <entity name="regions"
>>> >            pk="id"
>>> >            query="select p.id as placeid, p.countryid as countryid,
>>> > c.plainname as countryname, p.timezone as timezone, r.id as regionid,
>>> > r.plainname as regionname, r.population as regionpop from places p,
>>> regions
>>> > r, countries c where r.id = p.regionid AND p.settingid = 1 AND
>>> p.regionid >
>>> > 1 AND p.countryid=c.id AND p.cityid=1 AND r.population > 0"
>>> >            transformer="TemplateTransformer">
>>> >            <field column="regionid" name="regionid"/>
>>> >            <field column="regionname" name="regionname"/>
>>> >            <field column="regionpop" name="regionpop"/>
>>> >            <field column="countryid" name="countryid"/>
>>> >            <field column="timezone" name="timezone"/>
>>> >            <field column="regioninfo" template="${regions.regionname},
>>> > ${regions.countryname}?{ |id|: |${regions.placeid}|,
>>> > |timezone|:|${regions.timezone}|,|type|: |2|, |country|: { |id| :
>>> > |${regions.countryid}| }, |region|: { |id| : |${regions.regionid}|,
>>> > |plainname|: |${regions.regionname}|, |name|: |${regions.regionname},
>>> > ${regions.countryname}|  }, |city|: { |id|: |0| }, |hint|:
>>> > |${regions.regionpop}|, |label|: |${regions.regionname},
>>> > ${regions.countryname}|, |value|: |${regions.regionname},
>>> > ${regions.countryname}|, |title|: |${regions.regionname},
>>> > ${regions.countryname}| }"/>
>>> >    </entity>
>>> >    <entity name="cities"
>>> >            pk="id"
>>> >            query="select c2.id as cityid, c2.plainname as cityname,
>>> > c2.population as citypop, p.id as placeid, p.countryid as countryid,
>>> > c.plainname as countryname, p.timezone as timezone, r.id as regionid,
>>> > r.plainname as regionname from places p, regions r, countries c, cities
>>> c2
>>> > where c2.id = p.cityid AND p.settingid = 1 AND p.regionid > 1 AND
>>> > p.countryid=c.id AND r.id=p.regionid"
>>> >            transformer="TemplateTransformer">
>>> >            <field column="cityid" name="cityid"/>
>>> >            <field column="cityname" name="cityname"/>
>>> >            <field column="citypop" name="citypop"/>
>>> >            <field column="placeid" name="place_id2"/>
>>> >            <field column="regionid" name="regionid"/>
>>> >            <field column="regionname" name="regionname"/>
>>> >            <field column="countryid" name="countryid"/>
>>> >            <field column="plainname" name="countryname"/>
>>> >            <field column="timezone" name="timezone"/>
>>> >            <field column="fullplacename" template="${cities.cityname},
>>> > ${cities.regionname}, ${cities.countryname}?{ |id|: |${cities.placeid}|,
>>> > |timezone|:|${cities.timezone}|,|type|: |3|, |country|: { |id| :
>>> > |${cities.countryid}| }, |region|: { |id| : |${cities.regionid}| },
>>> |city|:
>>> > { |id|: |${cities.cityid}|, |plainname|: |${cities.cityname}|, |name|:
>>> > |${cities.cityname}, ${cities.regionname}, ${cities.countryname}| },
>>> > |hint|: |${cities.citypop}|, |label|: |${cities.cityname},
>>> > ${cities.regionname}, ${cities.countryname}|, |value|:
>>> |${cities.cityname},
>>> > ${cities.regionname}, ${cities.countryname}|, |title|:
>>> |${cities.cityname},
>>> > ${cities.regionname}, ${cities.countryname}| }"/>
>>> >    </entity>
>>> >  </document>
>>> > </dataConfig>
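The pipe characters in the templates above stand in for double quotes, and, as Dave describes, the client splits the returned term on the `?` and restores the quotes before parsing. A minimal sketch of that client-side decoding, assuming pipes map one-to-one to double quotes:

```javascript
// Hypothetical decoding of a suggester term like
// "Brooklyn, New York, United States?{ |id|: |2620829|, ... }"
function decodeSuggestion(term) {
  const idx = term.indexOf('?');
  const label = term.slice(0, idx); // the part the user actually typed against
  // Pipes stand in for double quotes so the payload survives templating.
  const payload = JSON.parse(term.slice(idx + 1).replace(/\|/g, '"'));
  return { label, payload };
}
```

This decoding only works if the payload text itself never contains `?` before the separator or literal `|` characters, which the templates above appear to guarantee.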
>>> >
>>> >
>>> >
>>> >
>>> > On Thu, Jan 19, 2012 at 11:52 AM, Robert Muir <rcmuir@gmail.com> wrote:
>>> >
>>> >> I don't think the problem is FST, since it sorts offline in your case.
>>> >>
>>> >> More importantly, what are you trying to put into the FST?
>>> >>
>>> >> it appears you are indexing terms from your term dictionary, but your
>>> >> term dictionary is over 1GB, why is that?
>>> >>
>>> >> what do your terms look like? 1GB for 2,784,937 documents does not make
>>> >> sense.
>>> >> for example, all place names in geonames (7.2M documents) creates a
>>> >> term dictionary of 22MB.
>>> >>
>>> >> So there is something wrong with your data importing and/or analysis
>>> >> process, your terms are not what you think they are.
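Robert's size comparison can be checked with quick arithmetic against the directory listing quoted later in the thread: a 1.1G .tis term dictionary over 2,784,937 documents works out to hundreds of bytes of term text per document, which fits whole JSON payloads as terms rather than short place names. A rough sketch (assuming the entire .tis size is term text, which overstates it slightly):

```javascript
// Back-of-envelope: bytes of term dictionary per document.
// 1.1G is the _2w.tis size from the directory listing in this thread.
const tisBytes = 1.1 * 1024 ** 3;
const docs = 2784937;
const bytesPerDoc = Math.round(tisBytes / docs); // roughly 424
console.log(bytesPerDoc);
```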
>>> >>
>>> >> On Thu, Jan 19, 2012 at 11:27 AM, Dave <dlauer@gmail.com> wrote:
>>> >> > I'm also seeing the error when I try to start up the SOLR instance:
>>> >> >
>>> >> > SEVERE: java.lang.OutOfMemoryError: Java heap space
>>> >> > at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:344)
>>> >> >  at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:352)
>>> >> > at org.apache.lucene.util.fst.FST$BytesWriter.writeByte(FST.java:975)
>>> >> >  at org.apache.lucene.util.fst.FST.writeLabel(FST.java:395)
>>> >> > at org.apache.lucene.util.fst.FST.addNode(FST.java:499)
>>> >> >  at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:182)
>>> >> > at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:270)
>>> >> >  at org.apache.lucene.util.fst.Builder.add(Builder.java:365)
>>> >> > at
>>> >> >
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTCompletionBuilder.buildAutomaton(FSTCompletionBuilder.java:228)
>>> >> >  at
>>> >> >
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTCompletionBuilder.build(FSTCompletionBuilder.java:202)
>>> >> > at
>>> >> >
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTCompletionLookup.build(FSTCompletionLookup.java:199)
>>> >> >  at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
>>> >> > at
>>> org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
>>> >> >  at
>>> org.apache.solr.spelling.suggest.Suggester.reload(Suggester.java:153)
>>> >> > at
>>> >> >
>>> >>
>>> org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener.newSearcher(SpellCheckComponent.java:675)
>>> >> >  at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1184)
>>> >> > at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>> >> >  at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>> >> > at
>>> >> >
>>> >>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> >> >  at
>>> >> >
>>> >>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> >> > at java.lang.Thread.run(Thread.java:662)
>>> >> >
>>> >> >
>>> >> > On Wed, Jan 18, 2012 at 5:24 PM, Dave <dlauer@gmail.com> wrote:
>>> >> >
>>> >> >> Unfortunately, that doesn't look like it solved my problem. I built
>>> >> >> the new .war file, dropped it in, and restarted the server. When I
>>> >> >> tried to build the spellchecker index, it ran out of memory again.
>>> >> >> Is there anything I needed to change in the configuration? Did I
>>> >> >> need to upload new .jar files, or was replacing the .war file enough?
>>> >> >>
>>> >> >> Jan 18, 2012 2:20:25 PM org.apache.solr.spelling.suggest.Suggester build
>>> >> >> INFO: build()
>>> >> >>
>>> >> >>
>>> >> >> Jan 18, 2012 2:22:06 PM org.apache.solr.common.SolrException log
>>> >> >>  SEVERE: java.lang.OutOfMemoryError: Java heap space
>>> >> >> at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:344)
>>> >> >> at org.apache.lucene.util.ArrayUtil.grow(ArrayUtil.java:352)
>>> >> >>  at
>>> org.apache.lucene.util.fst.FST$BytesWriter.writeByte(FST.java:975)
>>> >> >> at org.apache.lucene.util.fst.FST.writeLabel(FST.java:395)
>>> >> >>  at org.apache.lucene.util.fst.FST.addNode(FST.java:499)
>>> >> >> at org.apache.lucene.util.fst.Builder.compileNode(Builder.java:182)
>>> >> >>  at org.apache.lucene.util.fst.Builder.freezeTail(Builder.java:270)
>>> >> >> at org.apache.lucene.util.fst.Builder.add(Builder.java:365)
>>> >> >>  at
>>> >> >>
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTCompletionBuilder.buildAutomaton(FSTCompletionBuilder.java:228)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTCompletionBuilder.build(FSTCompletionBuilder.java:202)
>>> >> >>  at
>>> >> >>
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTCompletionLookup.build(FSTCompletionLookup.java:199)
>>> >> >> at org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
>>> >> >>  at
>>> org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:109)
>>> >> >>  at
>>> >> >>
>>> >>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:174)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>>> >> >>  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1375)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:358)
>>> >> >>  at
>>> >> >>
>>> >>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:253)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>> >> >>  at
>>> >> >>
>>> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>> >> >>  at
>>> >> >>
>>> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>> >> >> at
>>> >>
>>> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>> >> >>  at
>>> >> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>>> >> >> at
>>> >> >>
>>> >>
>>> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>> >> >>  at
>>> >> >>
>>> >>
>>> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>>> >> >> at
>>> >>
>>> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>> >> >>  at org.mortbay.jetty.Server.handle(Server.java:326)
>>> >> >> at
>>> >> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>> >> >>  at
>>> >> >>
>>> >>
>>> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>>> >> >> at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
>>> >> >>
>>> >> >>
>>> >> >> On Tue, Jan 17, 2012 at 8:59 AM, Robert Muir <rcmuir@gmail.com> wrote:
>>> >> >>
>>> >> >>> I committed it already: so you can try out branch_3x if you want.
>>> >> >>>
>>> >> >>> you can either wait for a nightly build or compile from svn
>>> >> >>> (http://svn.apache.org/repos/asf/lucene/dev/branches/branch_3x/).
>>> >> >>>
>>> >> >>> On Tue, Jan 17, 2012 at 8:35 AM, Dave <dlauer@gmail.com> wrote:
>>> >> >>> > Thank you Robert, I'd appreciate that. Any idea how long it will
>>> >> >>> > take to get a fix? Would I be better switching to trunk? Is trunk
>>> >> >>> > stable enough for someone who's very much a SOLR novice?
>>> >> >>> >
>>> >> >>> > Thanks,
>>> >> >>> > Dave
>>> >> >>> >
>>> >> >>> > On Mon, Jan 16, 2012 at 10:08 PM, Robert Muir <rcmuir@gmail.com> wrote:
>>> >> >>> >
>>> >> >>> >> looks like https://issues.apache.org/jira/browse/SOLR-2888.
>>> >> >>> >>
>>> >> >>> >> Previously, FST would need to hold all the terms in RAM during
>>> >> >>> >> construction, but with the patch it uses offline sorts/temporary
>>> >> >>> >> files. I'll reopen the issue to backport this to the 3.x branch.
>>> >> >>> >>
>>> >> >>> >>
>>> >> >>> >> On Mon, Jan 16, 2012 at 8:31 PM, Dave <dlauer@gmail.com> wrote:
>>> >> >>> >> > I'm trying to figure out what my memory needs are for a rather
>>> >> >>> >> > large dataset. I'm trying to build an auto-complete system for
>>> >> >>> >> > every city/state/country in the world. I've got a geographic
>>> >> >>> >> > database, and have setup the DIH to pull the proper data in.
>>> >> >>> >> > There are 2,784,937 documents which I've formatted into
>>> >> >>> >> > JSON-like output, so there's a bit of data associated with each
>>> >> >>> >> > one. Here is an example record:
>>> >> >>> >> >
>>> >> >>> >> > Brooklyn, New York, United States?{ |id|: |2620829|,
>>> >> >>> >> > |timezone|:|America/New_York|,|type|: |3|, |country|: { |id| :
>>> >> >>> >> > |229| }, |region|: { |id| : |3608| }, |city|: { |id|:
>>> >> >>> >> > |2616971|, |plainname|: |Brooklyn|, |name|: |Brooklyn, New
>>> >> >>> >> > York, United States| }, |hint|: |2300664|, |label|: |Brooklyn,
>>> >> >>> >> > New York, United States|, |value|: |Brooklyn, New York, United
>>> >> >>> >> > States|, |title|: |Brooklyn, New York, United States| }
>>> >> >>> >> >
>>> >> >>> >> > I've got the spellchecker / suggester module setup, and I can
>>> >> >>> >> > confirm that everything works properly with a smaller dataset
>>> >> >>> >> > (i.e. just a couple of countries worth of cities/states).
>>> >> >>> >> > However I'm running into a big problem when I try to index the
>>> >> >>> >> > entire dataset. The dataimport?command=full-import works and
>>> >> >>> >> > the system comes to an idle state. It generates the following
>>> >> >>> >> > data/index/ directory (I'm including it in case it gives any
>>> >> >>> >> > indication on memory requirements):
>>> >> >>> >> >
>>> >> >>> >> > -rw-rw---- 1 root   root   2.2G Jan 17 00:13 _2w.fdt
>>> >> >>> >> > -rw-rw---- 1 root   root    22M Jan 17 00:13 _2w.fdx
>>> >> >>> >> > -rw-rw---- 1 root   root    131 Jan 17 00:13 _2w.fnm
>>> >> >>> >> > -rw-rw---- 1 root   root   134M Jan 17 00:13 _2w.frq
>>> >> >>> >> > -rw-rw---- 1 root   root    16M Jan 17 00:13 _2w.nrm
>>> >> >>> >> > -rw-rw---- 1 root   root   130M Jan 17 00:13 _2w.prx
>>> >> >>> >> > -rw-rw---- 1 root   root   9.2M Jan 17 00:13 _2w.tii
>>> >> >>> >> > -rw-rw---- 1 root   root   1.1G Jan 17 00:13 _2w.tis
>>> >> >>> >> > -rw-rw---- 1 root   root     20 Jan 17 00:13 segments.gen
>>> >> >>> >> > -rw-rw---- 1 root   root    291 Jan 17 00:13 segments_2
>>> >> >>> >> >
>>> >> >>> >> > Next I try to run the suggest?spellcheck.build=true command,
>>> >> >>> >> > and I get the following error:
>>> >> >>> >> >
>>> >> >>> >> > Jan 16, 2012 4:01:47 PM org.apache.solr.spelling.suggest.Suggester build
>>> >> >>> >> > INFO: build()
>>> >> >>> >> > Jan 16, 2012 4:03:27 PM org.apache.solr.common.SolrException log
>>> >> >>> >> > SEVERE: java.lang.OutOfMemoryError: GC overhead limit exceeded
>>> >> >>> >> >  at java.util.Arrays.copyOfRange(Arrays.java:3209)
>>> >> >>> >> > at java.lang.String.<init>(String.java:215)
>>> >> >>> >> >  at
>>> org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122)
>>> >> >>> >> > at
>>> >> >>>
>>> org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:184)
>>> >> >>> >> >  at
>>> >> >>>
>>> org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:203)
>>> >> >>> >> > at
>>> >> >>>
>>> org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:172)
>>> >> >>> >> >  at
>>> >> >>>
>>> org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:509)
>>> >> >>> >> > at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.lucene.index.DirectoryReader.docFreq(DirectoryReader.java:719)
>>> >> >>> >> >  at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:309)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.lucene.search.spell.HighFrequencyDictionary$HighFrequencyIterator.isFrequent(HighFrequencyDictionary.java:75)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.lucene.search.spell.HighFrequencyDictionary$HighFrequencyIterator.hasNext(HighFrequencyDictionary.java:125)
>>> >> >>> >> > at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:157)
>>> >> >>> >> >  at
>>> org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
>>> >> >>> >> > at
>>> >> >>>
>>> org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.handler.component.SpellCheckComponent.prepare(SpellCheckComponent.java:109)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:173)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
>>> >> >>> >> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
>>> >> >>> >> > at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>>> >> >>> >> > at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
>>> >> >>> >> >  at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
>>> >> >>> >> > at
>>> >> >>>
>>> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>>> >> >>> >> >  at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
>>> >> >>> >> > at org.mortbay.jetty.Server.handle(Server.java:326)
>>> >> >>> >> >  at
>>> >> >>> >>
>>> >> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
>>> >> >>> >> >
>>> >> >>> >> >
>>> >> >>> >> > I also get an error if after the dataimport command completes,
>>> >> >>> >> > I just exit the SOLR process and restart it:
>>> >> >>> >> >
>>> >> >>> >> > Jan 16, 2012 4:06:15 PM org.apache.solr.common.SolrException log
>>> >> >>> >> > SEVERE: java.lang.OutOfMemoryError: Java heap space
>>> >> >>> >> > at
>>> org.apache.lucene.util.fst.NodeHash.rehash(NodeHash.java:158)
>>> >> >>> >> > at org.apache.lucene.util.fst.NodeHash.add(NodeHash.java:128)
>>> >> >>> >> >  at
>>> >> org.apache.lucene.util.fst.Builder.compileNode(Builder.java:161)
>>> >> >>> >> > at
>>> >> >>>
>>> org.apache.lucene.util.fst.Builder.compilePrevTail(Builder.java:247)
>>> >> >>> >> >  at org.apache.lucene.util.fst.Builder.add(Builder.java:364)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTLookup.buildAutomaton(FSTLookup.java:486)
>>> >> >>> >> >  at
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.lucene.search.suggest.fst.FSTLookup.build(FSTLookup.java:179)
>>> >> >>> >> > at
>>> org.apache.lucene.search.suggest.Lookup.build(Lookup.java:70)
>>> >> >>> >> >  at
>>> >> >>>
>>> org.apache.solr.spelling.suggest.Suggester.build(Suggester.java:133)
>>> >> >>> >> > at
>>> >> >>>
>>> org.apache.solr.spelling.suggest.Suggester.reload(Suggester.java:153)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener.newSearcher(SpellCheckComponent.java:675)
>>> >> >>> >> > at org.apache.solr.core.SolrCore$3.call(SolrCore.java:1181)
>>> >> >>> >> >  at
>>> >> >>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>> >> >>> >> > at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>> >> >>> >> >  at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>> >> >>> >> > at
>>> >> >>> >> >
>>> >> >>> >>
>>> >> >>>
>>> >>
>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>> >> >>> >> >  at java.lang.Thread.run(Thread.java:662)
>>> >> >>> >> >
>>> >> >>> >> > Jan 16, 2012 4:06:15 PM org.apache.solr.core.SolrCore registerSearcher
>>> >> >>> >> > INFO: [places] Registered new searcher Searcher@34b0ede5 main
>>> >> >>> >> >
>>> >> >>> >> >
>>> >> >>> >> >
>>> >> >>> >> > Basically this means once I've run a full-import, I cannot
>>> >> >>> >> > exit the SOLR process because I receive this error no matter
>>> >> >>> >> > what when I restart the process. I've tried with different
>>> >> >>> >> > -Xmx arguments, and I'm really at a loss at this point. Is
>>> >> >>> >> > there any guideline to how much RAM I need? I've got 8GB on
>>> >> >>> >> > this machine, although that could be increased if necessary.
>>> >> >>> >> > However, I can't understand why it would need so much memory.
>>> >> >>> >> > Could I have something configured incorrectly? I've been over
>>> >> >>> >> > the configs several times, trying to get them down to the bare
>>> >> >>> >> > minimum.
>>> >> >>> >> >
>>> >> >>> >> > Thanks for any assistance!
>>> >> >>> >> >
>>> >> >>> >> > Dave
>>> >> >>> >>
>>> >> >>> >>
>>> >> >>> >>
>>> >> >>> >> --
>>> >> >>> >> lucidimagination.com
>>> >> >>> >>
>>> >> >>>
>>> >> >>>
>>> >> >>>
>>> >> >>> --
>>> >> >>> lucidimagination.com
>>> >> >>>
>>> >> >>
>>> >> >>
>>> >>
>>> >>
>>> >>
>>> >> --
>>> >> lucidimagination.com
>>> >>
>>>
>>>
>>>
>>> --
>>> lucidimagination.com
>>>
>>
>>
