lucene-solr-user mailing list archives

From Shawn Heisey <s...@elyograg.org>
Subject Re: SolrCloud loadbalancing, replication, and failover
Date Thu, 31 Jul 2014 16:05:19 GMT
On 7/31/2014 12:58 AM, shussain@del.aithent.com wrote:
> Thanks for the great explanation about the memory requirements. Could you tell me
which parameters I need to change in my solrconfig.xml to handle a large index?
What are the optimal values that I should use?
>
> My indexed data size is 65 GB (for 8.6 million documents) and I have 48 GB of RAM on
my server. Whenever I perform delta-indexing, the server becomes unresponsive while updating
the index.
>
> Following are the changes that I made in solrconfig.xml after searching the net:
> <writeLockTimeout>60000</writeLockTimeout>
> <ramBufferSizeMB>256</ramBufferSizeMB>
> <useCompoundFile>false</useCompoundFile>
> <maxBufferedDocs>1000</maxBufferedDocs>
>
>  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
>           <int name="maxMergeAtOnce">10</int>
>           <int name="segmentsPerTier">10</int>
>  </mergePolicy>
>  
> <mergeFactor>10</mergeFactor>
> <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler"/>
>
> <lockType>simple</lockType>
> <unlockOnStartup>true</unlockOnStartup>
>
> <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>     <maxDocs>15000</maxDocs>
>     <openSearcher>true</openSearcher>
>   </autoCommit>
>   <updateLog>
>   <str name="dir">${solr.data.dir:}</str>
>  </updateLog>
> </updateHandler>
>
> So, please provide your valuable suggestions on this problem.

You replied directly to me, not to the list.  I am redirecting this back
to the list.

One of the first things that I would do is change openSearcher to false
in your autoCommit settings.  This will mean that you must take care of
commits yourself when you index, to make documents visible.  If you want
any more suggestions, we'll need to see the entire solrconfig.xml file.
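To illustrate that change, here is a minimal sketch of the autoCommit section with openSearcher set to false. The maxTime value of 15000 ms is an illustrative choice, not something from your posted config; pick an interval that suits your indexing rate:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush segments to disk regularly, but do NOT open
       a new searcher, so heavy indexing does not churn caches. -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog>
    <str name="dir">${solr.data.dir:}</str>
  </updateLog>
</updateHandler>
```

With openSearcher false, new documents become visible only when your indexing process issues an explicit commit (or you configure autoSoftCommit), which is exactly the responsibility mentioned above.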

The fact that you don't have enough RAM to cache your whole index could
be a problem.  If 8.6 million documents results in 65GB of index, then
your documents are probably quite large, and that can lead to other
possible challenges, because it usually means that a lot of work must be
done to index a single document.  There are also probably a lot of terms
to match when querying.

I do not know how much of your 48GB has been allocated to the java heap,
which takes away from memory that the operating system can use to cache
index files.
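As a rough illustration only (the 8 GB figure below is an assumption for the example, not a recommendation for your setup): with 48 GB of total RAM, whatever you assign to the Java heap is no longer available to the OS for caching that 65 GB index. With the Solr 4.x jetty start, the heap is set on the java command line:

```
# Example only: an 8 GB heap would leave roughly 40 GB for the OS
# page cache (minus whatever other processes use). Tune -Xmx for
# your own workload; too small causes OOM, too large starves the cache.
java -Xms8g -Xmx8g -jar start.jar
```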

Thanks,
Shawn

