lucene-solr-user mailing list archives

From "Kelly, Frank" <frank.ke...@here.com>
Subject Re: Solr Heap Dump: Any suggestions on what to look for?
Date Fri, 10 Feb 2017 16:18:49 GMT
To clarify 


"we put docValues="true" on the schema" should have said
"we put docValues="true" on the id field only"

-Frank
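
For readers on a managed schema, the change described above (docValues enabled on the id field only) could be expressed as a Schema API "replace-field" command. This is only a sketch: the field type "string" and the indexed/stored settings are assumptions, since the thread never states how the id field is defined, and the helper name is hypothetical. The script only builds and prints the JSON body; it does not contact Solr.

```python
import json


def replace_field_command(name, field_type, doc_values=True):
    """Build a Schema API 'replace-field' body that sets docValues on one field."""
    return {
        "replace-field": {
            "name": name,
            "type": field_type,   # assumption: the thread does not give the id field's type
            "indexed": True,      # assumption: typical for a unique key field
            "stored": True,       # assumption: typical for a unique key field
            "docValues": doc_values,
        }
    }


# Enable docValues on the id field only, as the correction above describes.
command = replace_field_command("id", "string")
print(json.dumps(command, indent=2))
```

The printed body would be POSTed to the collection's /schema endpoint. With a hand-edited schema.xml, the equivalent is adding the docValues="true" attribute to the id field definition. Either way, as noted later in this thread, the index has to be rebuilt for the change to take effect.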

On 2/10/17, 10:27 AM, "Kelly, Frank" <frank.kelly@here.com> wrote:

>Thanks Shawn,
>
>Yeah think we have identified root cause thanks to some of the suggestions
>here.
>
>Originally we stopped using deleteByQuery as we saw it caused some large
>CPU spikes (see https://issues.apache.org/jira/browse/LUCENE-7049) and
>Solr pauses, and switched to using a search and then deleteById. It worked
>fine on our (small) test collections.
>
>But with 200M documents it appears that deleteById causes the heap to
>increase dramatically (we guess fieldCache gets populated with a large
>number of object ids?)
>To confirm our suspicion we put docValues="true" on the schema and began
>to reindex and the heap memory usage dropped significantly - in fact heap
>memory usage on the Solr VMs dropped by a half.
>
>Can someone confirm (or deny) our suspicion that deleteById results in
>some on-heap caching of the unique key (id?)?
>
>
>Cheers!
>
>-Frank
>
>P.S. Interestingly, when I searched the Wiki for docs on deleteById I did
>not find any:
>https://cwiki.apache.org/confluence/dosearchsite.action?where=solr&spaceSearch=true&queryString=deleteById
>
>
>P.P.S. Separately, we are also turning off the filterCache. We know from
>usage and plugin stats that it is not in use, but it is best to turn it off
>entirely for risk reduction.
>
> 
>Frank Kelly
>Principal Software Engineer
> 
>HERE 
>5 Wayside Rd, Burlington, MA 01803, USA
>42° 29' 7" N 71° 11' 32" W
> 
> 
>
>
>
>On 2/9/17, 11:00 AM, "Shawn Heisey" <apache@elyograg.org> wrote:
>
>>On 2/9/2017 6:19 AM, Kelly, Frank wrote:
>>> Got a heap dump on an Out of Memory error.
>>> Analyzing the dump now in Visual VM
>>>
>>> Seeing a lot of byte[] arrays (77% of our 8GB Heap) in
>>>
>>>   * TreeMap$Entry
>>>   * FieldCacheImpl$SortedDocValues
>>>
>>> We're considering switching over to DocValues but would rather be
>>> definitive about the root cause before we experiment with DocValues
>>> and require a reindex of our 200M document index
>>> In each of our 4 data centers.
>>>
>>> Any suggestions on what I should look for in this heap dump to get a
>>> definitive root cause?
>>>
>>
>>When the large allocations in a heap dump are byte[] arrays, the cause
>>is likely a low-level class, probably in Lucene.  Solr will likely have
>>almost no influence on these
>>memory allocations, except by changing the schema to enable docValues,
>>which changes the particular Lucene code that is called.  Note that
>>wiping the index and rebuilding it from scratch is necessary when you
>>enable docValues.
>>
>>Another possible source of problems like this is the filterCache.  A 200
>>million document index (assuming it's all on the same machine) results
>>in filterCache entries that are 25 million bytes each.  In Solr
>>examples, the filterCache defaults to a size of 512.  If a cache that
>>size on a 200 million document index fills up, it will require nearly 13
>>gigabytes of heap memory.
>>
>>Thanks,
>>Shawn
>>
>
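
The filterCache figures in Shawn's reply can be checked with a few lines of arithmetic. This is a rough sketch, assuming each cache entry is a bitset with one bit per document in the index; it ignores per-entry object overhead and any more compact representation used for small result sets, so it is an upper-bound estimate and not a measurement. The function names are made up for illustration.

```python
def filter_cache_entry_bytes(max_doc):
    """Size of one bitset entry: one bit per document, rounded up to whole bytes."""
    return (max_doc + 7) // 8


def filter_cache_worst_case_bytes(max_doc, cache_size):
    """Heap needed if every slot in the cache holds a full-size bitset."""
    return filter_cache_entry_bytes(max_doc) * cache_size


# Numbers from the thread: 200 million documents, filterCache size of 512.
entry = filter_cache_entry_bytes(200_000_000)
total = filter_cache_worst_case_bytes(200_000_000, 512)

print(entry)   # 25000000 bytes per entry
print(total)   # 12800000000 bytes, i.e. about 12.8 GB for a full cache
```

Both results match the reply: 25 million bytes per entry and nearly 13 gigabytes if a 512-entry cache fills up, which is more than the 8GB heap mentioned in the original question.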
