cassandra-commits mailing list archives

From "Simon Zhou (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-13063) Too many instances of BigVersion
Date Wed, 21 Dec 2016 18:39:58 GMT


Simon Zhou commented on CASSANDRA-13063:

I noticed this issue while debugging CASSANDRA-13049. Having 70+ objects is not a problem in itself.
However, we had over 1 million open files during bootstrapping in CASSANDRA-13049, and almost
all of them were *-data.db and *-index.db files. Given that each sstable uses one Descriptor instance,
which holds a separate instance of BigVersion, the memory footprint is not trivial. For a
quick proof of concept, I tested with the code snippet below within

    public static void main(String[] args) throws InterruptedException
    {
        List<BigVersion> versions = new ArrayList<>();
        // Create half a million objects to simulate my issue in CASSANDRA-13049.
        for (int i = 0; i < 500000; i++) {
            versions.add(new BigVersion("3.0.10"));
        }
    }

Using JConsole, I can see the heap usage stays around 32MB AFTER PERFORMING GC. If
I then remove the "for" loop, the heap usage AFTER PERFORMING GC is ~9MB. That means the BigVersion
objects still account for over 20MB of memory.
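
As a rough back-of-the-envelope check on the figures above, the sketch below divides the heap difference by the object count. It assumes the ~32MB and ~9MB readings are exact and that the whole difference comes from the loop; the class and method names are hypothetical and not part of Cassandra. The result also includes the ArrayList's per-element reference slot, so it is an upper bound on the cost of one BigVersion rather than a precise object size.

```java
public class HeapEstimate {
    // Hypothetical helper: average retained bytes per object, given the heap
    // size with and without the objects (both in MB, measured after GC).
    public static long bytesPerObject(long heapWithMb, long heapWithoutMb, long objectCount) {
        long deltaBytes = (heapWithMb - heapWithoutMb) * 1024L * 1024L;
        return deltaBytes / objectCount; // integer division, rounds down
    }

    public static void main(String[] args) {
        // Figures taken from the JConsole observation: ~32MB vs ~9MB, 500000 objects.
        System.out.println(bytesPerObject(32, 9, 500000)); // prints 48
    }
}
```

Scaled to the roughly one million open files seen in CASSANDRA-13049, the same per-object figure would suggest a footprint on the order of tens of MB, which is consistent with the estimate in the comment.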

[~aleksey.kasyanov], as mentioned by [~slebresne], I'm not going to make it an enum; I would just use
something like the code below. Does that make sense to you?

    private static final ConcurrentHashMap<String, Version> versions = new ConcurrentHashMap<>();

    public Version getVersion(String version)
    {
        assert version != null : "Version cannot be null";

        Version bigVersion = versions.get(version);
        if (bigVersion == null) {
            bigVersion = new BigVersion(version);
            versions.putIfAbsent(version, bigVersion);
        }

        return versions.get(version);
    }
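
For comparison, the same caching idea can be sketched with `ConcurrentHashMap.computeIfAbsent` (available since Java 8), which folds the get / putIfAbsent / get sequence into one atomic call. This is only an illustration, not the patch proposed in the ticket: `VersionCache` and its nested `Version` class are hypothetical stand-ins for Cassandra's `Version` and `BigVersion`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class VersionCache {
    // Hypothetical stand-in for BigVersion; the real class lives in Cassandra.
    public static final class Version {
        public final String version;

        Version(String version) {
            this.version = version;
        }
    }

    // One shared instance per distinct version string.
    private static final ConcurrentMap<String, Version> VERSIONS = new ConcurrentHashMap<>();

    public static Version getVersion(String version) {
        if (version == null)
            throw new IllegalArgumentException("Version cannot be null");
        // Atomically creates the instance on first use and returns the cached one afterwards.
        return VERSIONS.computeIfAbsent(version, Version::new);
    }

    public static void main(String[] args) {
        Version a = getVersion("3.0.10");
        Version b = getVersion("3.0.10");
        System.out.println(a == b);          // prints true: both calls return the same instance
        System.out.println(VERSIONS.size()); // prints 1
    }
}
```

With this shape, half a million lookups for "3.0.10" would retain a single object instead of half a million, which is the effect the proposal is after.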

> Too many instances of BigVersion
> --------------------------------
>                 Key: CASSANDRA-13063
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Simon Zhou
>            Assignee: Simon Zhou
>            Priority: Minor
> When debugging with Cassandra 3.0.10 I found 70+ BigVersion objects on a new node after
. This was from a cluster created by CMM and had very little data. Since we create a new instance
of BigVersion for each SSTable, that would create too many objects, e.g., when bootstrapping
a new node in a cluster with many sstables.
> Looks like sstables can actually share the same BigVersion instance as long as they have the
same version. What we can do is create an object cache and only create a new object if not
> {code}
> ConcurrentHashMap<String, BigVersion> versions = new ConcurrentHashMap<>();
> {code}
> May not be a big deal but a minor improvement.  [~tjake] what do you think?

