From: Idrén, Johan <Johan.Idren@dice.se>
To: user@cassandra.apache.org
Subject: RE: memtable mem usage off by 10?
Date: Wed, 4 Jun 2014 12:26:08 +0000

Oh, well, OK, that explains why I'm not seeing a flush at 750MB. Sorry, I was going by the documentation; it claims the property is still around in 2.0.


If we skip that, part of my reply still makes sense:


With memtable_total_space_in_mb set to 20480, memtables are flushed at a reported value of ~2GB.


With a constant overhead of ~10x, as suggested, this would mean that it used 20GB, which is 2x the size of the heap.


That shouldn't work: according to the OS, Cassandra doesn't use more than ~11-12GB.
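
Spelled out, the objection is simple arithmetic. A quick sketch in Python (the ~10x factor is the one suggested below in the thread, not a measured value):

    # Back-of-the-envelope check of the numbers in this thread.
    heap_gb = 10.0               # -Xmx on both clusters
    reported_flush_gb = 2.0      # memtable size reported over JMX at flush time
    assumed_overhead = 10.0      # suggested constant overhead factor

    implied_heap_gb = reported_flush_gb * assumed_overhead
    print(f"implied memtable heap use: {implied_heap_gb:.0f} GB")   # 20 GB
    print(f"ratio to heap: {implied_heap_gb / heap_gb:.1f}x")       # 2.0x
    # ...yet the OS reports only ~11-12 GB resident for the whole process,
    # so a constant 10x overhead cannot be the whole story as stated.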



________________________________
From: Benedict Elliott Smith <belliottsmith@datastax.com>
Sent: Wednesday, June 4, 2014 2:07 PM
To: user@cassandra.apache.org
Subject: Re: memtable mem usage off by 10?
 
I'm confused: there is no flush_largest_memtables_at property in C* 2.0?


On 4 June 2014 12:55, Idrén, Johan <Johan.Idren@dice.se> wrote:

Ok, so the overhead is a constant modifier, right.


I arrived at the 3x with the following assumptions:


heap is 10GB

Default memory for memtable usage is 1/4 of heap in C* 2.0

max memory used for memtables is 2.5GB (10/4)

flush_largest_memtables_at is 0.75

flush largest memtables when memtables use 7.5GB (3/4 of heap, 3x the default)


With an overhead of 10x, it makes sense that my memtables are flushed when the JMX data says they are at ~250MB, i.e. 2.5GB actually used, i.e. 1/4 of the heap.


After I've set memtable_total_space_in_mb to a value larger than 7.5GB, it should still not go over 7.5GB on account of flush_largest_memtables_at (3/4 of the heap).


So I would expect to see memtables flushed to disk when they're reportedly at around 750MB.
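
Numerically, those assumptions combine as follows (a sketch restating the figures above; the 10x overhead is an assumption, applied to get the expected JMX-reported sizes):

    heap_mb = 10 * 1024          # 10 GB heap
    default_fraction = 1 / 4     # C* 2.0 default share of heap for memtables
    flush_largest_at = 0.75      # assumed threshold: flush at 3/4 of heap
    overhead = 10                # assumed constant object-overhead factor

    default_limit_mb = heap_mb * default_fraction       # 2560 MB actually used
    emergency_limit_mb = heap_mb * flush_largest_at     # 7680 MB actually used

    # If JMX reports raw data size only, the reported size at flush would be:
    print(default_limit_mb / overhead)     # ~256 MB -- matches the ~250 MB seen
    print(emergency_limit_mb / overhead)   # ~768 MB -- the ~750 MB expected above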


With memtable_total_space_in_mb set to 20480, memtables are flushed at a reported value of ~2GB.


With a constant overhead, this would mean that it used 20GB, which is 2x the size of the heap, instead of 3/4 of the heap as it should be if flush_largest_memtables_at were being respected.


This shouldn't be possible.



________________________________
From: Benedict Elliott Smith <belliottsmith@datastax.com>
Sent: Wednesday, June 4, 2014 1:19 PM

To: user@cassandra.apache.org
Subject: Re: memtable mem usage off by 10?
 
Unfortunately it looks like the heap utilisation of memtables was not exposed in earlier versions, because they only maintained an estimate.

The overhead scales linearly with the amount of data in your memtables (assuming the size of each cell is approx. constant).

flush_largest_memtables_at is a setting independent of memtable_total_space_in_mb, and generally has little effect. Ordinarily sstable flushes are triggered by hitting the memtable_total_space_in_mb limit. I'm afraid I don't follow where your 3x comes from?
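
As an illustration of that trigger logic, a rough model in Python (an invented sketch, not Cassandra's actual implementation; all names are made up):

    # Rough model of the flush trigger described above: the global
    # memtable_total_space_in_mb limit drives flushes, and the accounted
    # size includes object overhead, not just the raw data JMX reports.
    def maybe_flush(memtables, total_space_mb):
        total_live_mb = sum(m.live_size_mb for m in memtables)  # incl. overhead
        if total_live_mb >= total_space_mb:
            largest = max(memtables, key=lambda m: m.live_size_mb)
            largest.flush_to_sstable()  # free the largest memtable first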


On 4 June 2014 12:04, Idrén, Johan <Johan.Idren@dice.se> wrote:

Aha, ok. Thanks.


Trying to understand what my cluster is doing:


cassandra.db.memtable_data_size only gets me the actual data, not the memtable heap memory usage. Is there a way to check the heap memory usage?
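
For reference, one way to pull that per-table figure programmatically. This sketch assumes a Jolokia agent is attached to the Cassandra JVM on its default port 8778, and 'myks'/'mytable' are placeholder names; the MBean exposes only the raw data size, which is exactly the limitation discussed here:

    import requests  # assumes a Jolokia agent is attached to the Cassandra JVM

    # ColumnFamilyStoreMBean in the 1.2/2.0 line; keyspace/table are placeholders.
    mbean = ("org.apache.cassandra.db:type=ColumnFamilies,"
             "keyspace=myks,columnfamily=mytable")

    resp = requests.get(f"http://localhost:8778/jolokia/read/{mbean}/MemtableDataSize")
    print(resp.json()["value"], "bytes of raw memtable data (overhead not included)")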


I would expect to hit the flush_largest_memtables_at value, and this would be what causes the memtable flush to sstable then? By default 0.75?


Then I would expect the maximum amount of memory used to be ~3x what I was seeing when I hadn't set memtable_total_space_in_mb (1/4 by default, max 3/4 before a flush), instead of close to 10x (250MB vs 2GB).


This is of course assuming that the overhead scales linearly with the amount of data in my table; we're using one table with three cells in this case. If it hardly increases at all, then I'll give up, I guess :)

At least until 2.1.0 comes out and I can compare.


BR

Johan



________________________________
From: Benedict Elliott Smith <belliottsmith@datastax.com>
Sent: Wednesday, June 4, 2014 12:33 PM

To: user@cassandra.apache.org
Subject: Re: memtable mem usage off by 10?
 
These measurements tell you the amount of user data stored in the memtables, not the amount of heap used to store it, so the same applies.


On 4 June 2014 11:04, Idrén, Johan <Johan.Idren@dice.se> wrote:

I'm not measuring memtable size by looking at the sstables on disk, no. I'm looking through the JMX data. So I would believe (or hope) that I'm getting relevant data.


If I have a heap of 10GB and set the memtable usage to 20GB, I would expect to hit other problems, but I'm not seeing memory usage over 10GB for the heap, and the machine (which has ~30GB of memory) is showing ~10GB free, with ~12GB used by Cassandra and the rest in caches.


Reading 8k rows/s, writing 2k rows/s on a 3-node cluster. So it's not idling.


BR

Johan



________________________________
From: Benedict Elliott Smith <belliottsmith@datastax.com>
Sent: Wednesday, June 4, 2014 11:56 AM
To: user@cassandra.apache.org
Subject: Re: memtable mem usage off by 10?
 
If you are storing small values in your columns, the object overhead is very substantial: what is 400MB on disk may well be 4GB in memtables, so if you are measuring the memtable size by the resulting sstable size, you are not getting an accurate picture. This overhead has been reduced by about 90% in the upcoming 2.1 release, through tickets 6271, 6689 and 6694.
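
To see how a ~10x blow-up can arise with tiny cells, a rough per-cell estimate (the byte counts here are illustrative guesses, not the actual 2.0 object layout):

    payload_bytes = 20        # a small cell: short column name + short value
    overhead_bytes = 180      # rough guess: object headers, references,
                              # skip-list entries, boxed fields per on-heap cell

    inflation = (payload_bytes + overhead_bytes) / payload_bytes
    print(f"~{inflation:.0f}x on-heap inflation for {payload_bytes}-byte cells")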


On 4 June 2014 10:49, Idrén, Johan <Johan.Idren@dice.se> wrote:

Hi,


I'm seeing some strange behavior of the memtables in both 1.2.13 and 2.0.7: basically, it looks like they use 10x less memory than they should based on the documentation and options.


10GB heap for both clusters.

1.2.x should use 1/3 of the heap for memtables, but it uses at most ~300MB before flushing.


2.0.7: the same, but 1/4 of the heap and ~250MB.


In the 2.0.7 cluster I set memtable_total_space_in_mb to 4096, which then allowed Cassandra to use up to ~400MB for memtables...


I'm now running with 20480 for memtable_total_space_in_mb and Cassandra is using ~2GB for memtables.


So, off by 10 somewhere? Has anyone else seen this? I can't find a JIRA for any bug connected to this.
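
The implied factor is consistent across all four data points reported in this thread. A quick sketch of the arithmetic:

    # (memtable space limit, JMX-reported size at flush), both in MB
    observations = [
        (10240 / 3, 300),   # 1.2.x default: 1/3 of a 10 GB heap, flushed ~300 MB
        (10240 / 4, 250),   # 2.0.7 default: 1/4 of a 10 GB heap, flushed ~250 MB
        (4096,      400),   # 2.0.7, memtable_total_space_in_mb: 4096
        (20480,     2048),  # 2.0.7, memtable_total_space_in_mb: 20480
    ]
    for limit_mb, reported_mb in observations:
        print(f"limit {limit_mb:7.0f} MB -> flushed at {reported_mb} MB "
              f"(~{limit_mb / reported_mb:.0f}x)")
    # Every ratio lands near 10x: consistent with the limit applying to
    # data + overhead while JMX reports raw data only.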

Java 1.7.0_55, JNA 4.1.0 (for the 2.0 cluster)


BR

Johan




