Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Tue, 22 Apr 2014 21:32:24 +0000 (UTC)
From: "Benedict (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12551683.1334871582661.158645.1398202344374@arcas>
In-Reply-To: <JIRA.12551683.1334871582661@arcas>
References: <JIRA.12551683.1334871582661@arcas>
Subject: [jira] [Commented] (CASSANDRA-4175) Reduce memory, disk space, and
 cpu usage with a column name/id map
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-4175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977467#comment-13977467 ] 

Benedict commented on CASSANDRA-4175:
-------------------------------------

See also CASSANDRA-6917. IMO the best solution to this problem is an enum data type, with all column names then converted to that type.

> Reduce memory, disk space, and cpu usage with a column name/id map
> ------------------------------------------------------------------
>
>                 Key: CASSANDRA-4175
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4175
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Jonathan Ellis
>            Assignee: Jason Brown
>              Labels: performance
>             Fix For: 3.0
>
>
> We spend a lot of memory on column names, both transiently (during reads) and more permanently (in the row cache).  Compression mitigates this on disk but not on the heap.
> The overhead is significant for typical small column values, e.g., ints.
> Even though we intern once we get to the memtable, this affects writes too via very high allocation rates in the young generation, hence more GC activity.
> Now that CQL3 guarantees that column names must be defined before they are inserted, we could create a map from (say) 32-bit int column ids to names, and use the ids internally right up until we return a resultset to the client.
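
To make the proposal above concrete, here is a minimal sketch of a column name/id map, assuming ids are handed out sequentially when a column is defined. The class name ColumnIdMap and its methods (register, idOf, nameOf) are hypothetical and used only for illustration; this is not Cassandra's actual code, and it ignores persistence and cluster-wide agreement on ids.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a bidirectional map between column names
// and compact 32-bit ids. Not taken from the Cassandra code base.
final class ColumnIdMap {
    private final Map<String, Integer> idsByName = new HashMap<>();
    private final List<String> namesById = new ArrayList<>();

    // Called when a column is defined in the schema. Registering the same
    // name twice returns the id assigned the first time.
    synchronized int register(String name) {
        Integer existing = idsByName.get(name);
        if (existing != null) {
            return existing;
        }
        int id = namesById.size(); // ids are assigned sequentially from 0
        namesById.add(name);
        idsByName.put(name, id);
        return id;
    }

    // Write path: translate a name to its id. Names must be defined first,
    // which is the guarantee the proposal relies on.
    synchronized int idOf(String name) {
        Integer id = idsByName.get(name);
        if (id == null) {
            throw new IllegalArgumentException("undefined column: " + name);
        }
        return id;
    }

    // Read path: translate an id back to its name just before results are
    // returned to the client.
    synchronized String nameOf(int id) {
        if (id < 0 || id >= namesById.size()) {
            throw new IllegalArgumentException("unknown column id: " + id);
        }
        return namesById.get(id);
    }
}
```

With such a map, internal structures would carry a 4-byte int per column instead of a full name, and names would only be materialized at the client boundary.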

