lucene-solr-user mailing list archives

From Jay Potharaju <>
Subject Re: solr multicore vs sharding vs 1 big collection
Date Sun, 02 Aug 2015 23:29:46 GMT
Thanks for the feedback. I agree that increasing the timeout might
alleviate the timeout issue. The main problem with increasing it is the
detrimental effect it would have on the user experience, so I can't
increase it.
I have looked at the queries that threw errors; when I run them again,
everything seems to work fine, so I am not sure how to reproduce the error.
My concern with increasing the memory to 32GB is what happens when the
index size grows over the next few months.
One of the other solutions I have been thinking about is to rebuild the
index weekly into a new collection and then switch to it. Are there any
good references for doing that?
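
One possible way to do the weekly switch, assuming the setup runs in SolrCloud mode, is to have clients query a collection alias and re-point that alias at the freshly built collection with the Collections API CREATEALIAS action. The sketch below only builds the request URL; the host, alias name, and collection name are made-up placeholders, not values from this thread.

```python
from urllib.parse import urlencode


def create_alias_url(solr_base, alias, collection):
    """Build a Collections API CREATEALIAS URL (SolrCloud mode only).

    Issuing CREATEALIAS for an alias that already exists re-points it,
    so clients that query the alias move to the new collection.
    """
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,              # hypothetical alias the clients query
        "collections": collection,  # hypothetical freshly rebuilt collection
        "wt": "json",
    })
    return f"{solr_base}/admin/collections?{params}"


# Placeholder host and names, for illustration only.
print(create_alias_url("http://localhost:8983/solr", "products", "products_w31"))
```

After the alias points at the new collection, the previous week's collection can be deleted once nothing queries it directly.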

On Sun, Aug 2, 2015 at 10:19 AM, Shawn Heisey <> wrote:

> On 8/2/2015 8:29 AM, Jay Potharaju wrote:
> > The document contains around 30 fields, and stored is set to true for
> > almost 15 of them. These stored fields are queried and updated all the
> > time. You will notice that deleted documents make up almost 30% of the
> > docs. That percentage has stayed about the same and has not come down.
> > I did try optimize, but that was disruptive as it caused search errors.
> > I have been playing with the merge factor to see whether it helps with
> > deleted documents. It is currently set to 5.
> >
> > The server has 24 GB of memory, of which around 23 GB is normally in
> > use, and the JVM heap is set to 6 GB. I have noticed that the available
> > memory on the server drops to 100 MB at times during the day.
> > All the updates are run through DIH.
> Using all available memory is completely normal operation for ANY
> operating system.  If you hold up Windows as an example of one that
> doesn't ... it lies to you about "available" memory.  All modern
> operating systems will use any memory that is not explicitly allocated
> to programs as the OS disk cache.
>
> The disk cache will instantly give up any of the memory it is using to
> programs that request it.  Linux doesn't try to hide the disk cache from
> you, but older versions of Windows do.  In the newer versions of Windows
> that have the Resource Monitor, you can go there to see the actual
> memory usage including the cache.
> > At least once every day I see the following error, which results in
> > search errors on the front end of the site.
> >
> > ERROR org.apache.solr.servlet.SolrDispatchFilter -
> >
> >
> > From what I have read, these are mainly due to timeouts. My timeout is
> > set to 30 seconds and I can't set it to a higher number. I was thinking
> > that the high memory usage sometimes leads to bad performance/errors.
> Although this error can be caused by timeouts, it has a specific
> meaning.  It means that the client disconnected before Solr responded to
> the request, so when Solr tried to respond (through jetty), it found a
> closed TCP connection.
> Client timeouts need to either be completely removed, or set to a value
> much longer than any request will take.  Five minutes is a good starting
> value.
> If all your client timeouts are set to 30 seconds and you are seeing
> EofExceptions, that means that your requests are taking longer than 30
> seconds, and you likely have some performance issues.  It's also
> possible that some of your client timeouts are set a lot shorter than 30
> seconds.
> > My objective is to stop the errors; adding more memory to the server is
> > not a good scaling strategy. That is why I was thinking maybe there is an
> > issue with the way things are set up that needs to be revisited.
> You're right that adding more memory to the servers is not a good
> scaling strategy for the general case ... but in this situation, I think
> it might be prudent.  For your index and heap sizes, I would want the
> company to pay for at least 32GB of RAM.
> Having said that ... I've seen Solr installs work well with a LOT less
> memory than the ideal.  I don't know that adding more memory is
> necessary, unless your system (CPU, storage, and memory speeds) is
> particularly slow.  Based on your document count and index size, your
> documents are quite small, so I think your memory size is probably good
> -- if the CPU, memory bus, and storage are very fast.  If one or more of
> those subsystems aren't fast, then make up the difference with lots of
> memory.
> Some light reading, where you will learn why I think 32GB is an ideal
> memory size for your system:
> It is possible that your 6GB heap is not quite big enough for good
> performance, or that your GC is not well-tuned.  These topics are also
> discussed on that wiki page.  If you increase your heap size, then the
> likelihood of needing more memory in the system becomes greater, because
> there will be less memory available for the disk cache.
> Thanks,
> Shawn
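
The trade-off Shawn describes between heap size and disk cache can be put into rough numbers. This is only a back-of-the-envelope sketch: the 24 GB, 32 GB, and 6 GB figures come from this thread, while the 8 GB heap and the 1 GB allowance for the OS and other processes are assumptions made for illustration.

```python
def disk_cache_headroom_gb(total_ram_gb, jvm_heap_gb, other_gb=1.0):
    """Rough estimate of the RAM left over for the OS disk cache.

    other_gb is an assumed allowance for the OS and other processes.
    """
    return total_ram_gb - jvm_heap_gb - other_gb


# Current server: 24 GB RAM, 6 GB heap.
print(disk_cache_headroom_gb(24, 6))
# Suggested 32 GB RAM with the same 6 GB heap.
print(disk_cache_headroom_gb(32, 6))
# 32 GB RAM with a hypothetical larger 8 GB heap: less is left for the cache.
print(disk_cache_headroom_gb(32, 8))
```

Comparing the result with the on-disk index size gives a rough idea of how much of the index can stay cached as it grows.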

Jay Potharaju
