incubator-cassandra-user mailing list archives

From Zhu Han <schumi....@gmail.com>
Subject Re: [SOLVED] Very high memory utilization (not caused by mmap on sstables)
Date Tue, 21 Dec 2010 03:16:24 GMT
Can anybody recommend a stable enough JDK environment for the 0.6.x branch on
Ubuntu Server?

Thank you!

best regards,
hanzhu


On Sun, Dec 19, 2010 at 10:29 AM, Zhu Han <schumi.han@gmail.com> wrote:

> The problem still seems to be in the C heap of the JVM, which leaks about
> 70MB every day. Here is the summary:
>
> on 12/19: 00000000010c3000 178548K rw---    [ anon ]
> on 12/18: 00000000010c3000 110320K rw---    [ anon ]
> on 12/17: 00000000010c3000  39256K rw---    [ anon ]
>
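The "70MB every day" estimate can be checked against the three pmap samples quoted above. The following is a minimal sketch, assuming the samples were taken roughly 24 hours apart (the thread gives dates only, not times); the names `samples_kib` and `daily_growth_kib` are illustrative and not part of any tool.

```python
# Size in KiB of the anonymous mapping at 0x10c3000, copied from the
# three pmap samples quoted in the message.
samples_kib = {
    "12/17": 39256,
    "12/18": 110320,
    "12/19": 178548,
}


def daily_growth_kib(samples):
    """Return the deltas between consecutive samples, in KiB.

    Assumes the dict is ordered oldest to newest and that consecutive
    samples are about one day apart.
    """
    sizes = list(samples.values())
    return [later - earlier for earlier, later in zip(sizes, sizes[1:])]


deltas = daily_growth_kib(samples_kib)
average_mib_per_day = sum(deltas) / len(deltas) / 1024

print(deltas)                         # per-day growth in KiB
print(round(average_mib_per_day, 1))  # average growth in MiB per day
```

Under that assumption the average comes to about 68 MiB per day, which is consistent with the figure given in the message.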
> This should not be the JVM object heap, because the object heap size is
> fixed by the JVM settings below. Here is the map of the JVM object heap,
> which remains constant.
>
> 00000000010c3000  39256K rw---    [ anon ]
>
> I'll post it to the OpenJDK mailing list to seek help.
>
>> Zhu,
>> Couple of quick questions:
>>  How many threads are in your JVM?
>>
>
> There are hundreds of threads. Here are the Cassandra settings:
> 1)  *<ConcurrentReads>8</ConcurrentReads>
>   <ConcurrentWrites>128</ConcurrentWrites>*
>
> The thread stack size on this server is 1MB, so I observe hundreds of
> separate 1MB mmap segments.
>
>>  Can you also post the full commandline as well?
>>
> Sure. All of them are default settings.
>
> /usr/bin/java -ea -Xms1G -Xmx1G -XX:+UseParNewGC -XX:+UseConcMarkSweepGC
> -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1
> -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly
> -XX:+HeapDumpOnOutOfMemoryError -Dcom.sun.management.jmxremote.port=8080
> -Dcom.sun.management.jmxremote.ssl=false
> -Dcom.sun.management.jmxremote.authenticate=false
> -Dstorage-config=bin/../conf -cp
> bin/../conf:bin/../build/classes:bin/../lib/antlr-3.1.3.jar:bin/../lib/apache-cassandra-0.6.8.jar:bin/../lib/clhm-production.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-collections-3.2.1.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/google-collections-1.0.jar:bin/../lib/hadoop-core-0.20.1.jar:bin/../lib/high-scale-lib.jar:bin/../lib/ivy-2.1.0.jar:bin/../lib/jackson-core-asl-1.4.0.jar:bin/../lib/jackson-mapper-asl-1.4.0.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/jna.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-r917130.jar:bin/../lib/log4j-1.2.14.jar:bin/../lib/slf4j-api-1.5.8.jar:bin/../lib/slf4j-log4j12-1.5.8.jar
> org.apache.cassandra.thrift.CassandraDaemon
>
>
>>  Also, output of cat /proc/meminfo
>>
>
> This is an OpenVZ-based testing environment, so /proc/meminfo is not very
> helpful. Anyway, I'll paste it here.
>
>
> MemTotal:      9838380 kB
> MemFree:       4005900 kB
> Buffers:             0 kB
> Cached:              0 kB
> SwapCached:          0 kB
> Active:              0 kB
> Inactive:            0 kB
> HighTotal:           0 kB
> HighFree:            0 kB
> LowTotal:      9838380 kB
> LowFree:       4005900 kB
> SwapTotal:           0 kB
> SwapFree:            0 kB
> Dirty:               0 kB
> Writeback:           0 kB
> AnonPages:           0 kB
> Mapped:              0 kB
> Slab:                0 kB
> PageTables:          0 kB
> NFS_Unstable:        0 kB
> Bounce:              0 kB
> CommitLimit:         0 kB
> Committed_AS:        0 kB
> VmallocTotal:        0 kB
> VmallocUsed:         0 kB
> VmallocChunk:        0 kB
> HugePages_Total:     0
> HugePages_Free:      0
> HugePages_Rsvd:      0
> Hugepagesize:     2048 kB
>
>
>> thanks,
>> Sri
>>
>> On Fri, Dec 17, 2010 at 7:15 PM, Zhu Han <schumi.han@gmail.com> wrote:
>>
>> > Seems like the problem is still there after I upgraded to "OpenJDK Runtime
>> > Environment (IcedTea6 1.9.2)". So it is not related to the bug I reported
>> > two days ago.
>> >
>> > Can somebody else share some info with us? What's the java environment you
>> > used? Is it stable for long-lived cassandra instances?
>> >
>> > best regards,
>> > hanzhu
>> >
>> >
>> > On Thu, Dec 16, 2010 at 9:28 PM, Zhu Han <schumi.han@gmail.com> wrote:
>> >
>> > > I've tried it. But it does not work for me this afternoon.
>> > >
>> > > Thank you!
>> > >
>> > > best regards,
>> > > hanzhu
>> > >
>> > >
>> > >
>> > > On Thu, Dec 16, 2010 at 8:59 PM, Matthew Conway <matt@backupify.com> wrote:
>> > >
>> > >> Thanks for debugging this, I'm running into the same problem.
>> > >> BTW, if you can ssh into your nodes, you can use jconsole over ssh:
>> > >> http://simplygenius.com/2010/08/jconsole-via-socks-ssh-tunnel.html
>> > >>
>> > >> Matt
>> > >>
>> > >>
>> > >> On Dec 16, 2010, at Thu Dec 16, 2:39 AM, Zhu Han wrote:
>> > >>
>> > >> > Sorry for spam again. :-)
>> > >> >
>> > >> > I think I found the root cause. Here is a bug report[1] on a memory leak
>> > >> > of ParNewGC.  It is solved by OpenJDK 1.6.0_20 (IcedTea6 1.9.2)[2].
>> > >> >
>> > >> > So the suggestion is: for those who run Cassandra on Ubuntu 10.04, please
>> > >> > upgrade OpenJDK to the latest version.
>> > >> >
>> > >> > [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6824570
>> > >> > [2] http://blog.fuseyism.com/index.php/2010/09/10/icedtea6-19-released/
>> > >> >
>> > >> > best regards,
>> > >> > hanzhu
>> > >> >
>> > >> >
>> > >> > On Thu, Dec 16, 2010 at 3:10 PM, Zhu Han <schumi.han@gmail.com> wrote:
>> > >> >
>> > >> >> The test node is behind a firewall. So I took some time to find a way to
>> > >> >> get JMX diagnostic information from it.
>> > >> >>
>> > >> >> What's interesting is that both the HeapMemoryUsage and NonHeapMemoryUsage
>> > >> >> reported by the JVM are quite reasonable.  So it's a mystery why the JVM
>> > >> >> process maps such a big anonymous memory region...
>> > >> >>
>> > >> >> $ java -Xmx128m -jar /tmp/cmdline-jmxclient-0.10.3.jar - localhost:8080 java.lang:type=Memory HeapMemoryUsage
>> > >> >> 12/16/2010 15:07:45 +0800 org.archive.jmx.Client HeapMemoryUsage:
>> > >> >> committed: 1065025536
>> > >> >> init: 1073741824
>> > >> >> max: 1065025536
>> > >> >> used: 18295328
>> > >> >>
>> > >> >> $ java -Xmx128m -jar /tmp/cmdline-jmxclient-0.10.3.jar - localhost:8080 java.lang:type=Memory NonHeapMemoryUsage
>> > >> >> 12/16/2010 15:01:51 +0800 org.archive.jmx.Client NonHeapMemoryUsage:
>> > >> >> committed: 34308096
>> > >> >> init: 24313856
>> > >> >> max: 226492416
>> > >> >> used: 21475376
>> > >> >>
>> > >> >> If anybody is interested in it, I can provide more diagnostic information
>> > >> >> before I restart the instance.
>> > >> >>
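To make the mismatch concrete, the committed sizes from the two JMX queries above can be set against the resident size that `top` showed in the original report (2.4g). The following is a rough sketch, assuming that `top` reports binary units and that the two measurements, taken a few hours apart, are comparable; all variable names are illustrative.

```python
# Committed sizes in bytes, copied from the JMX output quoted above.
heap_committed = 1065025536
non_heap_committed = 34308096

# RES column from the `top` output in the original report.
# Assumption: "2.4g" means 2.4 GiB.
resident_gib = 2.4

MIB = 1024 * 1024

# Memory the JVM itself accounts for through the Memory MXBean.
jvm_reported_mib = (heap_committed + non_heap_committed) / MIB

# Resident memory seen by the operating system.
resident_mib = resident_gib * 1024

# Whatever remains is outside the Java heap and the non-heap pools.
unaccounted_mib = resident_mib - jvm_reported_mib

print(round(jvm_reported_mib, 2))
print(round(unaccounted_mib))
```

Under those assumptions the JVM accounts for a little over 1 GiB, leaving roughly 1.4 GiB of resident memory that neither JMX figure explains. That points at native allocations rather than the Java heap.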
>> > >> >> best regards,
>> > >> >> hanzhu
>> > >> >>
>> > >> >>
>> > >> >>
>> > >> >> On Thu, Dec 16, 2010 at 1:00 PM, Zhu Han <schumi.han@gmail.com> wrote:
>> > >> >>
>> > >> >>> After investigating it deeper, I suspect it's a native memory leak in the
>> > >> >>> JVM. The large anonymous map in the lower address space should be the
>> > >> >>> native heap of the JVM, not the Java object heap.  Has anybody met it before?
>> > >> >>>
>> > >> >>> I'll try to upgrade the JVM tonight.
>> > >> >>>
>> > >> >>> best regards,
>> > >> >>> hanzhu
>> > >> >>>
>> > >> >>>
>> > >> >>>
>> > >> >>> On Thu, Dec 16, 2010 at 10:50 AM, Zhu Han <schumi.han@gmail.com> wrote:
>> > >> >>>
>> > >> >>>> Hi,
>> > >> >>>>
>> > >> >>>> I have a test node with apache-cassandra-0.6.8 on Ubuntu 10.04.  The
>> > >> >>>> hardware environment is an OpenVZ container. The JVM settings are:
>> > >> >>>> # java -Xmx128m -version
>> > >> >>>> java version "1.6.0_18"
>> > >> >>>> OpenJDK Runtime Environment (IcedTea6 1.8.2) (6b18-1.8.2-4ubuntu2)
>> > >> >>>> OpenJDK 64-Bit Server VM (build 16.0-b13, mixed mode)
>> > >> >>>>
>> > >> >>>> This is the memory settings:
>> > >> >>>>
>> > >> >>>> "/usr/bin/java -ea -Xms1G -Xmx1G ..."
>> > >> >>>>
>> > >> >>>> And the ondisk footprint of sstables is very small:
>> > >> >>>>
>> > >> >>>> # du -sh data/
>> > >> >>>> 9.8M    data/
>> > >> >>>>
>> > >> >>>> The node was infrequently accessed in the last three weeks.  After that,
>> > >> >>>> I observed abnormal memory utilization with top:
>> > >> >>>>
>> > >> >>>>  PID USER      PR  NI  *VIRT*  *RES*  SHR S %CPU %MEM    TIME+  COMMAND
>> > >> >>>> 7836 root      15   0 *3300m*  *2.4g*  13m S    0 26.0  2:58.51 java
>> > >> >>>>
>> > >> >>>> The jvm heap utilization is quite normal:
>> > >> >>>>
>> > >> >>>> #sudo jstat -gc -J"-Xmx128m" 7836
>> > >> >>>> S0C    S1C    S0U    S1U      *EC*       *EU*       *OC*       *OU*       *PC      PU*      YGC  YGCT   FGC  FGCT   GCT
>> > >> >>>> 8512.0 8512.0 372.8   0.0   *68160.0*   *5225.7*   *963392.0   508200.7   30604.0  18373.4*  480  3.979  2    0.005  3.984
>> > >> >>>>
>> > >> >>>> And then I tried "pmap" to see the native memory mapping. *There are two
>> > >> >>>> large anonymous mmap regions.*
>> > >> >>>>
>> > >> >>>> 00000000080dc000 1573568K rw---    [ anon ]
>> > >> >>>> 00002b2afc900000  1079180K rw---    [ anon ]
>> > >> >>>>
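Picking the large anonymous regions out of a full pmap listing by eye is tedious, so a small filter can help. The following is a minimal sketch, assuming the four-column `address sizeK perms mapping` layout shown in the two lines above; the function name `large_anon_regions`, the 512 MiB threshold, and the third sample line are illustrative and not from the original report.

```python
import re

# Matches lines such as: "00000000080dc000 1573568K rw---    [ anon ]"
PMAP_LINE = re.compile(
    r"^(?P<addr>[0-9a-f]+)\s+(?P<kib>\d+)K\s+(?P<perms>\S+)\s+(?P<name>.+)$"
)


def large_anon_regions(pmap_text, min_kib=512 * 1024):
    """Return (start_address, size_kib) for anonymous mappings >= min_kib."""
    regions = []
    for line in pmap_text.splitlines():
        match = PMAP_LINE.match(line.strip())
        if match is None:
            continue  # header, total line, or an unrecognised format
        if match.group("name").strip() != "[ anon ]":
            continue  # file-backed mapping, e.g. a memory-mapped sstable
        size_kib = int(match.group("kib"))
        if size_kib >= min_kib:
            regions.append((int(match.group("addr"), 16), size_kib))
    return regions


# The first two lines are from the report; the third is a hypothetical
# 1 MiB thread-stack-sized mapping added to show the filter at work.
sample = """\
00000000080dc000 1573568K rw---    [ anon ]
00002b2afc900000  1079180K rw---    [ anon ]
00002b2afc800000     1024K rw---    [ anon ]
"""

regions = large_anon_regions(sample)
for start, size_kib in regions:
    print(hex(start), size_kib // 1024, "MiB")
```

Running the same filter on snapshots taken on different days would show which region is growing while the other stays fixed.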
>> > >> >>>> The second one should be the JVM heap.  What is the first one?  An mmap of
>> > >> >>>> an sstable should never be an anonymous mmap, but a file-based mmap.  *Is it a
>> > >> >>>> native memory leak?*  Does Cassandra allocate any DirectByteBuffer?
>> > >> >>>>
>> > >> >>>> best regards,
>> > >> >>>> hanzhu
>> > >> >>>>
>> > >> >>>
>> > >> >>>
>> > >> >>
>> > >>
>> > >>
>> > >
>> >
>>
>
>
