lucene-solr-user mailing list archives

From Shawn Heisey
Subject Re: Painfully slow transfer speed from Solr
Date Tue, 22 Nov 2011 05:46:11 GMT
On 11/21/2011 10:19 PM, Stephen Powis wrote:
> Thanks for the reply Shawn.
> The solr server currently has 8gb of ram and the total size of the dataDir
> is around 30gb.  I start solr and give the java heap up to 4gb of ram, so
> that leaves 4gb for the OS, there are no other running services on the
> box.  So from what you are saying, we are way under on the amount of ram we
> would ideally have.
> Just trying to get a better understanding of this.....Wouldn't the indexes
> not being in the disk cache make the queries themselves slow as well (high
> qTime), not just fetching the results?

Having 4GB for disk cache isn't much compared to 30GB, but it's probably 
enough to get most of the important bits cached.  On the other hand, 
unless your query is really complex, 1 second is a pretty slow response 
time.  (Yonik said the same thing, only better.)

> We currently store all the fields that we index, my reasoning behind that
> is that debugging results we get from solr w/o being able to see what is
> stored in solr would be near impossible (in my head anyhow..).  Generally
> our original source (mysql) and solr are consistent, but we've had cases
> where some updates have been missed for one reason or another.
> So my options are: reduce index sizes, increase ram on the server, increase
> disk speed (SSD drives)?

You're right about debugging being harder if you don't see all the data 
in the result.  It's something I have to deal with in my index.  Each of 
my index shards is already 20GB in size, and it would easily be triple 
that if I stored everything.

One thing you can do to help with debugging is include faceting on the 
field that you are trying to debug.  You won't see the original field 
values, but you will see what your index analyzer chain has done with 
it, which sometimes is even more useful.
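
As a rough illustration of the faceting trick above, the sketch below builds 
a request URL that facets on one field for a narrow query.  The host, the 
field name "title", the document id, and the helper name facet_debug_url 
are all made up for the example; the request parameters (q, rows, facet, 
facet.field, facet.limit, facet.mincount, wt) are standard Solr ones.

```python
from urllib.parse import urlencode


def facet_debug_url(base_url, field, query="*:*", limit=20):
    """Build a Solr select URL that facets on `field`.

    The facet counts list the indexed terms for that field, i.e. what
    the index analyzer chain produced, not the original stored values.
    """
    params = [
        ("q", query),            # restrict to the document(s) being debugged
        ("rows", 0),             # skip the documents, return only facet counts
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),  # cap the number of terms returned
        ("facet.mincount", 1),   # hide terms that match no documents
        ("wt", "json"),
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical host, field, and document id, for illustration only.
print(facet_debug_url("http://localhost:8983/solr", "title", query="id:12345"))
```

Restricting q to a single document keeps the facet list short enough to 
read: you see only the terms that document contributed to the field.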

SSD is an awesome option, but it becomes absolutely critical that you 
have redundant servers if you go that route.  This is because there are 
few (maybe no) RAID1, RAID5, or RAID10 solutions that support TRIM, 
which is absolutely required for good SSD performance.  With RAID0 or no 
RAID, a single SSD failure takes the server out.  Things can get very 
expensive very quickly with SSD.

With your current index size, if you can bump the server RAM to 32GB, 
your performance will be very good.  If you can go higher, it would be 
stellar.  You'll want to be running a 64-bit OS and 64-bit JVM.
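
The arithmetic behind those numbers can be sketched as follows.  This is a 
back-of-the-envelope estimate only: it assumes the Java heap stays at 4GB, 
that everything outside the heap is available as disk cache, and it ignores 
what the OS itself needs.  The function name cache_coverage is made up for 
the example.

```python
def cache_coverage(total_ram_gb, heap_gb, index_gb):
    """Fraction of the index that could fit in the OS disk cache.

    Simplification: all RAM not given to the Java heap is treated as
    disk cache, which overstates what is really available.
    """
    cache_gb = max(total_ram_gb - heap_gb, 0)
    return min(cache_gb / index_gb, 1.0)


# Current server (8GB) versus the suggested upgrade (32GB),
# with a 4GB heap and a 30GB index in both cases.
for ram_gb in (8, 32):
    share = cache_coverage(ram_gb, heap_gb=4, index_gb=30)
    print(f"{ram_gb}GB RAM: about {share:.0%} of the index fits in disk cache")
```

With 8GB of RAM only about an eighth of the index can be cached; at 32GB 
nearly all of it can, which is why the upgrade makes such a difference.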

