From: Jenny
To: user@cassandra.apache.org
Subject: what did actually happen?
Date: Sun, 4 Sep 2011 22:05:42 +0800

Hi,

I created a new super column family and inserted 1000 super columns into it in a short time. Cassandra 0.8 was running on my laptop. While I was inserting, the system kept flushing to files and compacting. I use the default configuration, and I can't figure out why the data files end up with the sizes listed below.

    public Integer binary_memtable_throughput_in_mb = 256;

    /* if the size of columns or super-columns are more than this, indexing will kick in */
    public Integer column_index_size_in_kb = 64;
    public Integer in_memory_compaction_limit_in_mb = 256;
    public Integer concurrent_compactors = Runtime.getRuntime().availableProcessors();
    public Integer compaction_throughput_mb_per_sec = 16;

/////// The compacted file keeps getting bigger. Is there any limit, e.g. 256 MB? Why are there three data files rather than one? When does compaction stop?

-rw-r--r--  1 wy  staff         0  9  4 21:07 0-g-29-Compacted
-rw-r--r--  2 wy  staff  26712582  9  4 21:07 0-g-29-Data.db
-rw-r--r--  2 wy  staff      1936  9  4 21:07 0-g-29-Filter.db
-rw-r--r--  2 wy  staff        14  9  4 21:07 0-g-29-Index.db
-rw-r--r--  2 wy  staff      4276  9  4 21:07 0-g-29-Statistics.db
-rw-r--r--  1 wy  staff         0  9  4 21:07 0-g-30-Compacted
-rw-r--r--  3 wy  staff   1206800  9  4 21:07 0-g-30-Data.db
-rw-r--r--  3 wy  staff        16  9  4 21:07 0-g-30-Filter.db
-rw-r--r--  3 wy  staff        14  9  4 21:07 0-g-30-Index.db
-rw-r--r--  3 wy  staff      4276  9  4 21:07 0-g-30-Statistics.db
-rw-r--r--  1 wy  staff         0  9  4 21:07 0-g-31-Compacted
-rw-r--r--  3 wy  staff   1238050  9  4 21:07 0-g-31-Data.db
-rw-r--r--  3 wy  staff        16  9  4 21:07 0-g-31-Filter.db
-rw-r--r--  3 wy  staff        14  9  4 21:07 0-g-31-Index.db
-rw-r--r--  3 wy  staff      4276  9  4 21:07 0-g-31-Statistics.db

////// Under the "backups" folder
///// These are the original data files, all almost the same size, about 1.2 MB. Does that mean the threshold at which in-memory data is flushed to disk is 1.2 MB?

    conf.memtable_total_space_in_mb = (int) (Runtime.getRuntime().maxMemory() / (3 * 1048576));

With -Xmx256M, conf.memtable_total_space_in_mb = 83 MB. So where does 1.2 MB come from?

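To make the arithmetic behind that expression concrete, here is a minimal Java sketch. The class and method names are hypothetical and only the formula is taken from the line quoted above. It also prints what this JVM reports from maxMemory(), since that value can be somewhat lower than the -Xmx setting; that would be consistent with seeing 83 rather than 85, but it is an assumption about the JVM in use, not something confirmed here.

```java
// Hypothetical helper class; only the formula comes from the quoted config line.
class MemtableSpaceSketch {
    static final long BYTES_PER_MB = 1048576L;

    // One third of the maximum heap, truncated to whole megabytes.
    static int memtableTotalSpaceInMb(long maxHeapBytes) {
        return (int) (maxHeapBytes / (3 * BYTES_PER_MB));
    }

    public static void main(String[] args) {
        // Value the running JVM actually reports; may be below -Xmx.
        long reported = Runtime.getRuntime().maxMemory();
        System.out.println("maxMemory() reports " + reported + " bytes");
        System.out.println("derived memtable_total_space_in_mb: "
                + memtableTotalSpaceInMb(reported));

        // A heap of exactly 256 MB gives 85; a reported heap of 249 MB gives 83.
        System.out.println("exactly 256 MB: " + memtableTotalSpaceInMb(256 * BYTES_PER_MB));
        System.out.println("249 MB:         " + memtableTotalSpaceInMb(249 * BYTES_PER_MB));
    }
}
```

Either way the result is tens of megabytes, so this setting alone does not account for flushes of about 1.2 MB.
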
-rw-r--r--  1 wy  staff   1221766  9  4 21:00 0-g-1-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:00 0-g-1-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:00 0-g-1-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:00 0-g-1-Statistics.db
-rw-r--r--  1 wy  staff   1210346  9  4 21:02 0-g-10-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:02 0-g-10-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:02 0-g-10-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:02 0-g-10-Statistics.db
-rw-r--r--  1 wy  staff   1225068  9  4 21:02 0-g-11-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:02 0-g-11-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:02 0-g-11-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:02 0-g-11-Statistics.db
-rw-r--r--  1 wy  staff   1223328  9  4 21:02 0-g-12-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:02 0-g-12-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:02 0-g-12-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:02 0-g-12-Statistics.db
-rw-r--r--  1 wy  staff   1211216  9  4 21:03 0-g-14-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:03 0-g-14-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:03 0-g-14-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:03 0-g-14-Statistics.db
-rw-r--r--  1 wy  staff   1256036  9  4 21:03 0-g-15-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:03 0-g-15-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:03 0-g-15-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:03 0-g-15-Statistics.db
-rw-r--r--  1 wy  staff   1201208  9  4 21:03 0-g-16-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:03 0-g-16-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:03 0-g-16-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:03 0-g-16-Statistics.db
-rw-r--r--  1 wy  staff   1208896  9  4 21:04 0-g-18-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:04 0-g-18-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:04 0-g-18-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:04 0-g-18-Statistics.db
-rw-r--r--  1 wy  staff   1251322  9  4 21:04 0-g-19-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:04 0-g-19-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:04 0-g-19-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:04 0-g-19-Statistics.db
-rw-r--r--  1 wy  staff   1211504  9  4 21:00 0-g-2-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:00 0-g-2-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:00 0-g-2-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:00 0-g-2-Statistics.db
-rw-r--r--  1 wy  staff   1249872  9  4 21:04 0-g-20-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:04 0-g-20-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:04 0-g-20-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:04 0-g-20-Statistics.db
-rw-r--r--  1 wy  staff   1218034  9  4 21:05 0-g-22-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:05 0-g-22-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:05 0-g-22-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:05 0-g-22-Statistics.db
-rw-r--r--  1 wy  staff   1244578  9  4 21:05 0-g-23-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:05 0-g-23-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:05 0-g-23-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:05 0-g-23-Statistics.db
-rw-r--r--  1 wy  staff   1213320  9  4 21:05 0-g-24-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:05 0-g-24-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:05 0-g-24-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:05 0-g-24-Statistics.db
-rw-r--r--  1 wy  staff   1196204  9  4 21:06 0-g-26-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:06 0-g-26-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:06 0-g-26-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:06 0-g-26-Statistics.db
-rw-r--r--  1 wy  staff   1236890  9  4 21:06 0-g-27-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:06 0-g-27-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:06 0-g-27-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:06 0-g-27-Statistics.db
-rw-r--r--  1 wy  staff   1174382  9  4 21:06 0-g-28-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:06 0-g-28-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:06 0-g-28-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:06 0-g-28-Statistics.db
-rw-r--r--  1 wy  staff   1196784  9  4 21:01 0-g-3-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:01 0-g-3-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:01 0-g-3-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:01 0-g-3-Statistics.db
-rw-r--r--  3 wy  staff   1206800  9  4 21:07 0-g-30-Data.db
-rw-r--r--  3 wy  staff        16  9  4 21:07 0-g-30-Filter.db
-rw-r--r--  3 wy  staff        14  9  4 21:07 0-g-30-Index.db
-rw-r--r--  3 wy  staff      4276  9  4 21:07 0-g-30-Statistics.db
-rw-r--r--  3 wy  staff   1238050  9  4 21:07 0-g-31-Data.db
-rw-r--r--  3 wy  staff        16  9  4 21:07 0-g-31-Filter.db
-rw-r--r--  3 wy  staff        14  9  4 21:07 0-g-31-Index.db
-rw-r--r--  3 wy  staff      4276  9  4 21:07 0-g-31-Statistics.db
-rw-r--r--  1 wy  staff   1195044  9  4 21:01 0-g-4-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:01 0-g-4-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:01 0-g-4-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:01 0-g-4-Statistics.db
-rw-r--r--  1 wy  staff   1170538  9  4 21:01 0-g-6-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:01 0-g-6-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:01 0-g-6-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:01 0-g-6-Statistics.db
-rw-r--r--  1 wy  staff   1199468  9  4 21:01 0-g-7-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:01 0-g-7-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:01 0-g-7-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:01 0-g-7-Statistics.db
-rw-r--r--  1 wy  staff   1198532  9  4 21:02 0-g-8-Data.db
-rw-r--r--  1 wy  staff        16  9  4 21:02 0-g-8-Filter.db
-rw-r--r--  1 wy  staff        14  9  4 21:02 0-g-8-Index.db
-rw-r--r--  1 wy  staff      4276  9  4 21:02 0-g-8-Statistics.db

///// Is there any optimization for extremely large row sizes?

Best Regards!

Yi Wang(Jenny)

    

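
A quick arithmetic check on the two listings above, as a minimal Java sketch. It assumes, which the listings suggest but do not prove, that 0-g-29-Data.db is the result of compacting the 22 flushed files numbered 1 to 28 in the "backups" folder (30 and 31 are left out because they are still live). The class name and array are hypothetical; the byte counts are copied from the listings.

```java
// Hypothetical check; sizes are the Data.db byte counts from the listings.
class CompactionSizeCheck {
    // Flushed Data.db files 1-4, 6-8, 10-12, 14-16, 18-20, 22-24, 26-28.
    static final long[] FLUSHED = {
        1221766, 1211504, 1196784, 1195044,
        1170538, 1199468, 1198532,
        1210346, 1225068, 1223328,
        1211216, 1256036, 1201208,
        1208896, 1251322, 1249872,
        1218034, 1244578, 1213320,
        1196204, 1236890, 1174382
    };

    // Size of 0-g-29-Data.db in the live data directory.
    static final long COMPACTED_29 = 26712582L;

    static long sum(long[] sizes) {
        long total = 0;
        for (long size : sizes) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        long flushedTotal = sum(FLUSHED);
        System.out.println("flushed files:      " + FLUSHED.length);
        System.out.println("sum of flushed:     " + flushedTotal);
        System.out.println("compacted 0-g-29:   " + COMPACTED_29);
        System.out.println("difference (bytes): " + (flushedTotal - COMPACTED_29));
    }
}
```

Under that assumption the compacted file is within about 2 KB of the sum of the flushed files, so compaction here appears to merge the flushes into one file without shrinking the data.
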