lucene-solr-user mailing list archives

From Markus Jelsma <markus.jel...@openindex.io>
Subject RE: Solr uses lots of shared memory!
Date Mon, 04 Sep 2017 11:03:57 GMT
Hello Kevin, Rick,

These are interesting points indeed. But this thread is about shared memory, not virtual memory.

Any value higher than 0 for MALLOC_ARENA_MAX only reduces virtual memory consumption, from
22 GB to 16 GB. There is no difference in shared memory or resident memory.

Although interesting, it unfortunately does not answer the question.

To answer Rick's question: the difference with MALLOC_ARENA_MAX=2 is less virtual memory but
no change in queries/second.
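
(For reference, the knob is just an environment variable that the JVM inherits at
startup; a minimal sketch of how we set it, assuming the service installer layout
where bin/solr sources /etc/default/solr.in.sh:)

  # cap glibc at two malloc arenas for the Solr JVM
  export MALLOC_ARENA_MAX=2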

Thanks,
Markus

-----Original message-----
> From:Rick Leir <rleir@leirtech.com>
> Sent: Sunday 3rd September 2017 17:08
> To: solr-user@lucene.apache.org
> Subject: Re: Solr uses lots of shared memory!
> 
> Hi all
> Malloc holds a lock while it is active in the heap. If there is more than one thread, and
> malloc finds the lock in use, then it avoids waiting on the lock by creating a new 'arena'
> to hold its heap. My understanding is that a process with multiple threads which are all
> active users of malloc will eventually have an arena per thread. If you limit the number
> of arenas, you may suffer delays waiting on locks.
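> 
> If you want to see how many arenas a process has actually accumulated, a very
> rough heuristic is to count anonymous mappings that start on a 64 MB boundary,
> since 64-bit glibc aligns its per-thread heaps that way (just a sketch, assuming
> gawk for strtonum; <pid> is a placeholder):
> 
>   gawk '{ split($1, a, "-");                    # address range of the mapping
>           if (strtonum("0x" a[1]) % (64*1024*1024) == 0 && $6 == "") n++ }
>         END { print n, "probable malloc arena heaps" }' /proc/<pid>/maps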
> 
> But this needs performance testing. My experience was with C++, not with a JVM. I would
> be interested to know if setting MALLOC_ARENA_MAX=2 makes a difference to performance.
> Cheers -- Rick
> 
> On September 2, 2017 1:15:38 PM EDT, Kevin Risden <compuwizard123@gmail.com> wrote:
> >I haven't looked at reproducing this locally, but since it seems like
> >there haven't been any new ideas, I decided to share this in case it
> >helps:
> >
> >I noticed in Travis CI [1] they are adding the environment variable
> >MALLOC_ARENA_MAX=2, so I googled what that configuration did. To my
> >surprise, I came across a stackoverflow post [2] about how glibc could
> >actually be the cause and report memory differently. I then found a
> >Hadoop issue, HADOOP-7154 [3], about setting this as well to reduce
> >virtual memory usage. I found some more cases where this has helped as
> >well [4], [5], and [6].
> >
> >[1]
> >https://docs.travis-ci.com/user/build-environment-updates/2017-09-06/#Added
> >[2]
> >https://stackoverflow.com/questions/10575342/what-would-cause-a-java-process-to-greatly-exceed-the-xmx-or-xss-limit
> >[3]
> >https://issues.apache.org/jira/browse/HADOOP-7154?focusedCommentId=14505792
> >[4] https://github.com/cloudfoundry/java-buildpack/issues/320
> >[5] https://devcenter.heroku.com/articles/tuning-glibc-memory-behavior
> >[6]
> >https://www.ibm.com/developerworks/community/blogs/kevgrig/entry/linux_glibc_2_10_rhel_6_malloc_may_show_excessive_virtual_memory_usage?lang=en
> >Kevin Risden
> >
> >
> >On Thu, Aug 24, 2017 at 10:19 AM, Markus Jelsma
> ><markus.jelsma@openindex.io> wrote:
> >> Hello Bernd,
> >>
> >> According to the man page, i should get a list of stuff in shared
> >> memory if i invoke it with just a PID. That shows a list of libraries
> >> that together account for about 25 MB of shared memory usage. According
> >> to ps and top, the JVM uses 2800 MB shared memory (not virtual), which
> >> leaves 2775 MB unaccounted for. Any ideas? Can anyone else reproduce it
> >> on a freshly restarted node?
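> >>
> >> For anyone who wants to check their own node: a rough way to account for
> >> the shared pages (a sketch, assuming Linux; it just sums the Shared_Clean
> >> and Shared_Dirty counters from smaps, and <pid> is a placeholder):
> >>
> >>   awk '/^Shared_(Clean|Dirty)/ { sum += $2 } END { print sum, "kB shared" }' /proc/<pid>/smaps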
> >>
> >> Thanks,
> >> Markus
> >>
> >>
> >>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
> >> 18901 markus    20   0 14,778g 4,965g 2,987g S 891,1 31,7  20:21.63 java
> >>
> >> 0x000055b9a17f1000      6K      /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
> >> 0x00007fdf1d314000      182K    /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libsunec.so
> >> 0x00007fdf1e548000      38K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libmanagement.so
> >> 0x00007fdf1e78e000      94K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libnet.so
> >> 0x00007fdf1e9a6000      75K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libnio.so
> >> 0x00007fdf5cd6e000      34K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libzip.so
> >> 0x00007fdf5cf77000      46K     /lib/x86_64-linux-gnu/libnss_files-2.24.so
> >> 0x00007fdf5d189000      46K     /lib/x86_64-linux-gnu/libnss_nis-2.24.so
> >> 0x00007fdf5d395000      90K     /lib/x86_64-linux-gnu/libnsl-2.24.so
> >> 0x00007fdf5d5ae000      34K     /lib/x86_64-linux-gnu/libnss_compat-2.24.so
> >> 0x00007fdf5d7b7000      187K    /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libjava.so
> >> 0x00007fdf5d9e6000      70K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/libverify.so
> >> 0x00007fdf5dbf8000      30K     /lib/x86_64-linux-gnu/librt-2.24.so
> >> 0x00007fdf5de00000      90K     /lib/x86_64-linux-gnu/libgcc_s.so.1
> >> 0x00007fdf5e017000      1063K   /lib/x86_64-linux-gnu/libm-2.24.so
> >> 0x00007fdf5e320000      1553K   /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.22
> >> 0x00007fdf5e6a8000      15936K  /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
> >> 0x00007fdf5f5ed000      139K    /lib/x86_64-linux-gnu/libpthread-2.24.so
> >> 0x00007fdf5f80b000      14K     /lib/x86_64-linux-gnu/libdl-2.24.so
> >> 0x00007fdf5fa0f000      110K    /lib/x86_64-linux-gnu/libz.so.1.2.11
> >> 0x00007fdf5fc2b000      1813K   /lib/x86_64-linux-gnu/libc-2.24.so
> >> 0x00007fdf5fff2000      58K     /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/jli/libjli.so
> >> 0x00007fdf60201000      158K    /lib/x86_64-linux-gnu/ld-2.24.so
> >>
> >> -----Original message-----
> >>> From:Bernd Fehling <bernd.fehling@uni-bielefeld.de>
> >>> Sent: Thursday 24th August 2017 15:39
> >>> To: solr-user@lucene.apache.org
> >>> Subject: Re: Solr uses lots of shared memory!
> >>>
> >>> Just an idea, how about taking a dump with jmap and using
> >>> MemoryAnalyzerTool to see what is going on?
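> >>>
> >>> Something like this, assuming you run it as the Solr user against the
> >>> Solr pid (the live option forces a full GC first, so the dump contains
> >>> only reachable objects):
> >>>
> >>>   jmap -dump:live,format=b,file=/tmp/solr-heap.hprof <pid>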
> >>>
> >>> Regards
> >>> Bernd
> >>>
> >>>
> >>> On 24.08.2017 at 11:49, Markus Jelsma wrote:
> >>> > Hello Shalin,
> >>> >
> >>> > Yes, the main search index has DocValues on just a few fields;
> >>> > they are used for faceting and function queries. We started using
> >>> > DocValues when 6.0 was released. Most fields are content fields for
> >>> > many languages. I don't think it is going to be DocValues, because
> >>> > the max shared memory consumption is reduced by searching on fields
> >>> > of fewer languages, and by disabling highlighting, neither of which
> >>> > uses DocValues.
> >>> >
> >>> > But i tried the option regardless, also because i didn't know
> >>> > about it. It appears the option does exactly nothing. The first
> >>> > line below is without any configuration for preload, the second is
> >>> > with preload=true, the third is preload=false:
> >>> >
> >>> > 14220 markus    20   0 14,675g 1,508g  62800 S   1,0  9,6   0:36.98 java
> >>> > 14803 markus    20   0 14,674g 1,537g  63248 S   0,0  9,8   0:34.50 java
> >>> > 15324 markus    20   0 14,674g 1,409g  63152 S   0,0  9,0   0:35.50 java
> >>> >
> >>> > Please correct my config if i am wrong:
> >>> >
> >>> >   <directoryFactory name="DirectoryFactory"
> >>> >                     class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}">
> >>> >      <bool name="preload">false</bool>
> >>> >   </directoryFactory>
> >>> >
> >>> > NRTCachingDirectoryFactory implies MMapDirectory right?
> >>> >
> >>> > Thanks,
> >>> > Markus
> >>> >
> >>> > -----Original message-----
> >>> >> From:Shalin Shekhar Mangar <shalinmangar@gmail.com>
> >>> >> Sent: Thursday 24th August 2017 5:51
> >>> >> To: solr-user@lucene.apache.org
> >>> >> Subject: Re: Solr uses lots of shared memory!
> >>> >>
> >>> >> Very interesting. Do you have many DocValue fields? Have you always
> >>> >> had them, i.e. did you see this problem before you turned on
> >>> >> DocValues? The DocValue fields are in a separate file and they will
> >>> >> be memory mapped on demand. One thing you can experiment with is the
> >>> >> preload=true option on the MMapDirectoryFactory, which will mmap all
> >>> >> index files on startup [1]. Once you do this, and if you still
> >>> >> notice shared memory leakage, then it may be a genuine memory leak
> >>> >> that we should investigate.
> >>> >>
> >>> >> [1] - http://lucene.apache.org/solr/guide/6_6/datadir-and-directoryfactory-in-solrconfig.html#DataDirandDirectoryFactoryinSolrConfig-SpecifyingtheDirectoryFactoryForYourIndex
> >>> >>
> >>> >> On Wed, Aug 23, 2017 at 7:02 PM, Markus Jelsma
> >>> >> <markus.jelsma@openindex.io> wrote:
> >>> >>> I do not think it is a problem of reporting: after watching top
> >>> >>> after a restart of some Solr instances, it dropped back to
> >>> >>> `normal`, around 350 MB, which i think is high too, but anyway.
> >>> >>>
> >>> >>> Two hours later, the restarted nodes have slowly increased shared
> >>> >>> memory consumption to about 1500 MB. I don't understand why shared
> >>> >>> memory usage should/would increase slowly over time; it makes
> >>> >>> little sense to me, and i cannot remember Solr doing this in the
> >>> >>> past ten years.
> >>> >>>
> >>> >>> But it seems to correlate to index size on disk: these main text
> >>> >>> search nodes have an index of around 16 GB and up to 3 GB of shared
> >>> >>> memory after a few days. Log nodes have up to 800 MB index size and
> >>> >>> 320 MB of shared memory. The low latency nodes have four different
> >>> >>> cores that make up just over 100 MB index size; their shared memory
> >>> >>> consumption is just 22 MB, which seems more reasonable for the case
> >>> >>> of shared memory.
> >>> >>>
> >>> >>> I can also force Solr to 'leak' shared memory just by sending
> >>> >>> queries to it. My freshly restarted local node used 68 MB shared
> >>> >>> memory at startup. Two minutes and 25.000 queries later it was
> >>> >>> already at 2748 MB! At first there is a very sharp increase to
> >>> >>> 2000, then it takes almost two minutes more to reach 2748. I can
> >>> >>> decrease the maximum shared memory usage to 1200 if i query (via
> >>> >>> edismax) only on fields of one language instead of 25 or so. I can
> >>> >>> decrease it even further if i disable highlighting (HUH?) but still
> >>> >>> query on all fields.
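> >>> >>>
> >>> >>> (A sketch of the kind of loop that reproduces this, assuming a
> >>> >>> local node on the default port; core name and query are
> >>> >>> placeholders:)
> >>> >>>
> >>> >>>   for i in $(seq 1 25000); do
> >>> >>>     curl -s 'http://localhost:8983/solr/mycore/select?q=some+text&defType=edismax' > /dev/null
> >>> >>>   done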
> >>> >>>
> >>> >>> * We have tried patching Java's ByteBuffer [1] because it seemed
> >>> >>>   to fit the problem, but it does not fix it.
> >>> >>> * We have also removed all our custom plugins, so it has become a
> >>> >>>   vanilla Solr 6.6 with just our stripped down schema and
> >>> >>>   solrconfig; that does not fix it either.
> >>> >>>
> >>> >>> Why does it slowly increase over time?
> >>> >>> Why does it appear to correlate to index size?
> >>> >>> Is anyone else seeing this on their 6.6 cloud production or
> >>> >>> local machines?
> >>> >>>
> >>> >>> Thanks,
> >>> >>> Markus
> >>> >>>
> >>> >>> [1]: http://www.evanjones.ca/java-bytebuffer-leak.html
> >>> >>>
> >>> >>> -----Original message-----
> >>> >>>> From:Shawn Heisey <apache@elyograg.org>
> >>> >>>> Sent: Tuesday 22nd August 2017 17:32
> >>> >>>> To: solr-user@lucene.apache.org
> >>> >>>> Subject: Re: Solr uses lots of shared memory!
> >>> >>>>
> >>> >>>> On 8/22/2017 7:24 AM, Markus Jelsma wrote:
> >>> >>>>> I have never seen this before: one of our collections, all
> >>> >>>>> nodes eating tons of shared memory!
> >>> >>>>>
> >>> >>>>> Here's one of the nodes:
> >>> >>>>> 10497 solr      20   0 19.439g 4.505g 3.139g S   1.0 57.8 2511:46 java
> >>> >>>>>
> >>> >>>>> RSS is roughly equal to heap size + usual off-heap space +
> >>> >>>>> shared memory. Virtual is equal to RSS plus index size on disk.
> >>> >>>>> For two other collections, the nodes use shared memory as
> >>> >>>>> expected, in the MB range.
> >>> >>>>>
> >>> >>>>> How can Solr, this collection, use so much shared memory? Why?
> >>> >>>>
> >>> >>>> I've seen this on my own servers at work, and when I add up a
> >>> >>>> subset of the memory numbers I can see from the system, it ends
> >>> >>>> up being more memory than I even have in the server.
> >>> >>>>
> >>> >>>> I suspect there is something odd going on in how Java reports
> >>> >>>> memory usage to the OS, or maybe a glitch in how Linux interprets
> >>> >>>> Java's memory usage.  At some point in the past, numbers were
> >>> >>>> reported correctly.  I do not know if the change came about
> >>> >>>> because of a Solr upgrade, because of a Java upgrade, or because
> >>> >>>> of an OS kernel upgrade.  All three were upgraded between when I
> >>> >>>> know the numbers looked right and when I noticed they were wrong.
> >>> >>>>
> >>> >>>>
> >>> >>>> https://www.dropbox.com/s/91uqlrnfghr2heo/solr-memory-sorted-top.png?dl=0
> >>> >>>>
> >>> >>>> This screenshot shows that Solr is using 17GB of memory, 41.45GB
> >>> >>>> of memory is being used by the OS disk cache, and 10.23GB of
> >>> >>>> memory is free.  Add those up, and it comes to 68.68GB ... but
> >>> >>>> the machine only has 64GB of memory, and that total doesn't
> >>> >>>> include the memory usage of the other processes seen in the
> >>> >>>> screenshot.  This impossible situation means that something is
> >>> >>>> being misreported somewhere.  If I deduct that 11GB of SHR from
> >>> >>>> the RES value, then all the numbers work: 68.68GB minus 11GB is
> >>> >>>> 57.68GB, which fits within 64GB.
> >>> >>>>
> >>> >>>> The screenshot was almost 3 years ago, so I do not know what
> >>> >>>> machine it came from, and therefore I can't be sure what the
> >>> >>>> actual heap size was.  I think it was about 6GB -- the difference
> >>> >>>> between RES and SHR.  I have used a 6GB heap on some of my
> >>> >>>> production servers in the past.  The server where I got this
> >>> >>>> screenshot was not having any noticeable performance or memory
> >>> >>>> problems, so I think that I can trust that the main numbers above
> >>> >>>> the process list (which only come from the OS) are correct.
> >>> >>>>
> >>> >>>> Thanks,
> >>> >>>> Shawn
> >>> >>>>
> >>> >>>>
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >> Regards,
> >>> >> Shalin Shekhar Mangar.
> >>> >>
> >>>
> 
> -- 
> Sorry for being brief. Alternate email is rickleir at yahoo dot com 
