lucene-dev mailing list archives

From "Renee Sun (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-2138) Solr 1.4 takes long time to load cores (memory leak?)
Date Wed, 06 Oct 2010 16:10:31 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12918565#action_12918565 ]

Renee Sun commented on SOLR-2138:
---------------------------------

Following Yonik's suggestions, we looked at our solrconfig.xml and found that newSearcher and
firstSearcher listeners are configured:

<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">type:message</str>
      <str name="start">0</str>
      <str name="rows">10</str>
      <str name="sort">message_date desc</str>
    </lst>
  </arr>
</listener>

The newSearcher listener has exactly the same query.

After we commented both listeners out, the cores loaded in 1 minute.

Here is Yonik's post regarding this:

"The sort field message_date is what will be taking up the memory. 

Starting with Lucene 2.9 (which is used in Solr 1.4), searching and 
sorting is per-segment. 
This is generally beneficial, but in this case I believe it is causing 
the extra memory usage because the same date value that would have 
been shared across all documents in the fieldcache is now repeated in 
each segment it is used in. 

One potential fix (that requires you to reindex) is to use the "date" 
fieldType as defined in the new 1.4 schema: 
    <fieldType name="date" class="solr.TrieDateField" omitNorms="true"
               precisionStep="0" positionIncrementGap="0"/>

This will use 8 bytes per document in your index, rather than 4 bytes 
per doc + an array of unique string-date values per index. 

Trunk (4.0-dev) is also much more efficient at storing string-based 
fields in the FieldCache - but that will only help you if you're 
comfortable with using development versions."

We cannot use solr.TrieDateField, since reindexing is not an option for us.
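
To make the trade-off in Yonik's note concrete, here is a rough back-of-the-envelope sketch in Python. It is not a measurement of Solr or Lucene: the 4 and 8 bytes per document come from the quote above, while the per-string heap cost (80 bytes), the document counts, the segment layout, and the function names are all assumptions chosen only for illustration.

```python
# Rough, illustrative estimate of sort-field cache memory.
# Only the 4-byte and 8-byte per-document figures come from Yonik's note;
# everything else here is an assumption, not a measured Solr/Lucene value.

BYTES_PER_ORD = 4         # per-document entry for a string sort field (from the quote)
BYTES_PER_TRIE_DATE = 8   # per-document entry for a TrieDateField (from the quote)
BYTES_PER_STRING = 80     # ASSUMED heap cost of one cached date string


def string_sort_bytes(segments):
    """Estimate bytes for a string sort field.

    segments: list of (doc_count, unique_value_count) pairs, one per cache.
    Each cache holds one entry per document plus its own array of unique strings.
    """
    return sum(docs * BYTES_PER_ORD + uniques * BYTES_PER_STRING
               for docs, uniques in segments)


def trie_date_bytes(segments):
    """Estimate bytes for a trie date sort field: a fixed cost per document."""
    return sum(docs * BYTES_PER_TRIE_DATE for docs, _ in segments)


# Hypothetical core: 1,000,000 documents with 100,000 distinct date strings.
whole_index = [(1_000_000, 100_000)]

# The same hypothetical core as 10 segments of 100,000 documents each.
# ASSUMPTION: each segment sees 50,000 distinct dates, many of which also
# appear in other segments and are therefore stored once per segment.
per_segment = [(100_000, 50_000)] * 10

print("string sort, one index-wide cache:", string_sort_bytes(whole_index))
print("string sort, per-segment caches:  ", string_sort_bytes(per_segment))
print("trie date, per-segment caches:    ", trie_date_bytes(per_segment))
```

Under these assumed numbers the per-segment string caches come to 44,000,000 bytes, against 12,000,000 bytes for a single index-wide cache and 8,000,000 bytes for the trie date field. The real figures depend on how many distinct date values each segment holds and on the actual per-string overhead in the JVM.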


> Solr 1.4 takes long time to load cores (memory leak?)
> -----------------------------------------------------
>
>                 Key: SOLR-2138
>                 URL: https://issues.apache.org/jira/browse/SOLR-2138
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 1.4
>         Environment: 8 processors Intel(R) Xeon(R) CPU           E5345  @ 2.33GHz
>            Reporter: Renee Sun
>
> 1. 32 GB total memory, with 16 GB allocated to the Solr server.
> 2. 130 cores, most with 50,000 documents, and 2 or 3 cores with 1 to 2.4 million documents (the largest core takes about 11 GB of disk space).
> 3. In Solr 1.3 there was no problem; it took 5 to 6 minutes to load all cores.
> 4. After upgrading to Solr 1.4, it takes 45+ minutes to load all 130 cores.
> 5. No solrconfig or schema changes.
> 6. autowarmCount="0" for all caches.
> I monitored the memory with JConsole. The 'queryConverter' warning in the catalina.out file helped me figure out that by the time about 70 cores were loaded, memory usage had gone from 300 MB to 16 GB and stayed at that level. The rest of the cores loaded extremely slowly.
> I found Yonik's fix note for SOLR-1797: "fix ConcurrentModificationException and potential memory leaks in ResourceLoader. (yonik)"
> We are in the process of upgrading to 1.4.1.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



