Message-ID: <24046962.222791278449030892.JavaMail.jira@thor>
Date: Tue, 6 Jul 2010 16:43:50 -0400 (EDT)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] Resolved: (CASSANDRA-1230) Memory use grows extremely fast with super column families
In-Reply-To: <23900143.68011277505530351.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/CASSANDRA-1230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-1230.
---------------------------------------

    Fix Version/s:     (was: 0.6.4)
       Resolution: Duplicate

You're generating very large rows that you don't have enough memory to compact. This is fixed in CASSANDRA-16 for 0.7; until then, the answer is "don't do that." (Specifically, the row size limit in 0.6 is 2 GB or when you run out of memory, whichever comes first.)

> Memory use grows extremely fast with super column families
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-1230
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1230
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6
>         Environment: Single node Ubuntu 10.04 64 bit, sun-java6 from partner repositories, using pycassa 0.3.0 to insert events.
>            Reporter: Heikki Toivonen
>            Priority: Critical
>         Attachments: supercolbug.py
>
>
> I have a script that inserts about 1kB of key/values into each of 10k super columns, in each of 1k rows. Or at least I tried to. I noticed that Cassandra's memory usage went up so fast that I was only able to insert into a few dozen rows before my machine ran out of memory. When I use regular column families Cassandra's memory usage seems pretty flat, so this seems to be an issue specifically with super columns.
> The test program is attached and copied below.
> {code}
> #!/usr/bin/env python
> # Program to demonstrate a use case where Cassandra memory usage grows
> # without bounds using super column family:
> # - 1 row   140 MB RES 1400 MB VIRT
> # - 5 rows  532        1600
> # - 10      580        1632
> # - 20      801        1775
> # - 40      958        2047
> # ...
> #
> # Stopping Cassandra and restarting makes it jump immediately to the same
> # virtual memory usage. Resident memory size seems to be about
> # half of the state prior to stopping.
> #
> # _JAVA_OPTIONS: -Xms64m -Xmx1G
> # Cassandra 0.6.2 with default storage-conf.xml on single node
> # Ubuntu 10.04 64bit
> # sun-java6
> # pycassa 0.3.0
> import uuid
> import pycassa
>
> def insert10k(cf, rowkey):
>     for i in xrange(10000):
>         cf.insert(rowkey, {
>             str(i): {
>                 "abcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "bbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "cbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "dbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "ebcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "fbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "gbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "hbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "ibcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "jbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "kbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "lbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "mbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "nbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "obcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "pbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "qbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "rbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "sbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "tbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "ubcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "vbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "wbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "xbcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "ybcdefghijklmnopqrstuvwxyz":'1234567890',
>                 "zbcdefghijklmnopqrstuvwxyz":'1234567890',
>             },
>         })
>
> def super_column():
>     client = pycassa.connect()
>     cf = pycassa.ColumnFamily(client, 'Keyspace1', 'Super1', super=True)
>     i = 0
>     while i < 1000:
>         insert10k(cf, uuid.uuid4().hex)
>         print i, 'inserted 10k'
>         i += 1
>
> if __name__ == '__main__':
>     super_column()
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
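
As a rough check on the "about 1kB" per super column and "10k super columns" per row figures in the quoted report, the sketch below tallies the raw name and value bytes that the quoted script writes. It is not part of the original report: the function and constant names are made up for illustration, and it counts payload bytes only (no timestamps, row keys, serialization, or JVM object overhead), so it says nothing about Cassandra's actual heap or on-disk usage.

```python
import string

# Figures taken from the quoted script.
SUPER_COLUMNS_PER_ROW = 10000  # xrange(10000) in insert10k
ROWS = 1000                    # while i < 1000 in super_column

# The 26 subcolumn names: "abcdef...z", "bbcdef...z", ..., "zbcdef...z".
SUBCOLUMN_NAMES = [c + "bcdefghijklmnopqrstuvwxyz" for c in string.ascii_lowercase]
SUBCOLUMN_VALUE = "1234567890"


def payload_bytes_per_super_column():
    """Name bytes plus value bytes for the 26 subcolumns of one super column."""
    return sum(len(name) + len(SUBCOLUMN_VALUE) for name in SUBCOLUMN_NAMES)


def payload_bytes_per_row():
    """Subcolumn payload for every super column, plus the super column names str(i)."""
    super_column_name_bytes = sum(len(str(i)) for i in range(SUPER_COLUMNS_PER_ROW))
    return (SUPER_COLUMNS_PER_ROW * payload_bytes_per_super_column()
            + super_column_name_bytes)


if __name__ == "__main__":
    per_sc = payload_bytes_per_super_column()
    per_row = payload_bytes_per_row()
    print("payload per super column: %d bytes" % per_sc)
    print("payload per row:          %d bytes" % per_row)
    print("payload for all rows:     %d bytes" % (per_row * ROWS))
```

The estimate is a lower bound on what the script sends; the memory needed to hold or compact a row in Cassandra 0.6 will be larger than this raw payload.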