From: graham sanderson
Subject: Re: Is per-table memory overhead due to SSTables or tables?
Date: Fri, 8 Aug 2014 19:14:42 -0500
To: user@cassandra.apache.org

See https://issues.apache.org/jira/browse/CASSANDRA-5935

2.1 has a radically different implementation that sidesteps this (with off-heap memtables), but if you really want lots of tables now you can do so as a trade-off against GC behavior.
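If you want to put a rough number on what the current behavior costs you, counting your tables and multiplying by the ~1MB memtable slab is close enough for back-of-the-envelope purposes. Something like the sketch below (illustrative only; it assumes the DataStax python-driver, a locally reachable node, and the 2.x system schema, and the real figure depends on how many memtables are actually live):

# Back-of-the-envelope estimate of memtable slab overhead from the table
# count, per the ~1MB-per-memtable figure discussed in this thread.
# Assumes a locally reachable node and the DataStax python-driver.
from cassandra.cluster import Cluster

SLAB_BYTES_PER_MEMTABLE = 1024 * 1024  # initial slab region per memtable (~1MB)

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

# On 2.0/2.1 the list of tables lives in system.schema_columnfamilies.
rows = session.execute(
    "SELECT keyspace_name, columnfamily_name FROM system.schema_columnfamilies")
table_count = sum(1 for _ in rows)

worst_case_mb = table_count * SLAB_BYTES_PER_MEMTABLE / (1024.0 * 1024.0)
print("%d tables -> worst case ~%.0f MB of heap just for memtable slabs"
      % (table_count, worst_case_mb))

cluster.shutdown()

At 1000 tables that works out to the ~1GB of heap mentioned below, which is exactly the pressure the off-heap memtables in 2.1 relieve.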
The problem is not SSTables per se, but more potentially one memtable per CF (and with the slab allocator that can/does cost ~1MB); I am not familiar enough with the code to know when you would have 1 memtable vs 0 memtables for a CF that isn't currently actively used.

Note also https://issues.apache.org/jira/browse/CASSANDRA-6602 and friends; there is definitely a need for efficient discarding of old data in event streams.

On Aug 8, 2014, at 2:29 PM, Kevin Burton <burton@spinn3r.com> wrote:

> The "conventional wisdom" says that it's ideal to only use "in the low hundreds" of tables with Cassandra, as each table can use 1MB or so of heap. So if you have 1000 tables you'd have 1GB of heap used (which is no fun).
>
> But is this an issue with the tables themselves or the SSTables?
>
> I think the root of this is the SSTables, as all the arena overhead will be for the SSTables too, and more SSTables means more overhead.
>
> So by adding more tables, you end up with more SSTables, which means more heap memory.
>
> If I'm correct, then this means that Cassandra could benefit from table partitioning, whereby you put all values in a specific range into a specific set of tables.
>
> So if you were storing log data, you could store it in hourly or daily partitions, but view the table as one logical unit.
>
> The benefit here is that you could easily just drop the oldest data. So if you need to clean up data, you wouldn't have to drop the whole table, just a day's worth of the data.
>
> And since that day is just one SSTable on disk, the drop would be easy: no tombstones, just delete the whole SSTable.
>
> --
> Founder/CEO Spinn3r.com
> Location: San Francisco, CA
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
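For what it's worth, people already do the manual version of the time partitioning you describe: one table per day, dropped wholesale when it ages out, so there is nothing like tombstones to compact away. A rough sketch of what that looks like with the DataStax python-driver (the keyspace, table layout and helper names are all made up for illustration):

# One-table-per-day "manual partitioning" for event/log data: create a table
# for the new day, drop the table for the day that just aged out.  Dropping a
# table discards its SSTables outright, so there are no tombstones to compact.
# Keyspace, schema and helper names below are illustrative only.
from datetime import date, timedelta
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('logs')   # assumes a 'logs' keyspace already exists

def table_for(day):
    return "events_%s" % day.strftime("%Y%m%d")

def create_day(day):
    # DDL can't take bound parameters, so the table name is interpolated.
    session.execute("""
        CREATE TABLE IF NOT EXISTS %s (
            source  text,
            ts      timeuuid,
            payload text,
            PRIMARY KEY (source, ts)
        )""" % table_for(day))

def drop_day(day):
    session.execute("DROP TABLE IF EXISTS %s" % table_for(day))

# Keep a rolling 30-day window: add today's table, drop the expired one.
today = date.today()
create_day(today)
drop_day(today - timedelta(days=30))

cluster.shutdown()

With a bounded window (say 30 daily tables) the ~1MB-per-table cost above is only ~30MB of heap, comfortably inside the "low hundreds of tables" guidance; it is unbounded table counts that get you into GC trouble.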

2.1 has = a radically different implementation that side steps this (with off heap = memtables), but if you really want lots of tables now you can do so as a = trade off against GC behavior.

The problem is = not SSTables per se, but more potentially one memtable per CF (and with = slab allocator that can/does cost 1M); I am not familiar enough with the = code to know when you would have 1 memtable vs 0 memtable for a CF that = isn=92t currently actively used.

Note = also https://issu= es.apache.org/jira/browse/CASSANDRA-6602 and friends; there is = definitely a need for efficient discarding of old data in event = streams.


On Aug 8, 2014, at 2:29 = PM, Kevin Burton <burton@spinn3r.com> = wrote:

The "conventional wisdom" says that it's = ideal to only use "in the low hundreds" in the number of tables with = cassandra as each table can use 1MB or so of heap.  So if you have = 1000 tables you'd have 1GB of heap used (which is no fun).

But is this an issue with the tables themselves or the = SSTables?

I think the root of this is the = SSTables as all the arena overhead will be for the SSTables too and more = SSTables means more overhead.

So by adding more tables, you end up with more = SSTables which means more heap memory.

If I'm = in correct then this means that Cassandra could benefit from table = partitioning.  Whereby you put all values in a specific region to a = specific set of tables.

So if you were storing log data, you could store it = in hourly, or daily partitions, but view the table as one logical = unit.

the benefit here is that you could easily = just drop the oldest data.  So if you need to clean up data, you = wouldn't have to drop the whole table, just a days worth of the = data. 

And since that day is just one SSTable on disk, the = drop would be easy.. no tombstones, just delete the whole = SSTable.



-- =

Founder/CEO Spinn3r.com
Location: San = Francisco, CA
=85 or check out my Google+ profile


= --Apple-Mail=_C116CC2C-755C-4CA3-9FD9-54A2F197B280-- --Apple-Mail=_32275A3D-1DEE-4962-84D3-E8DE7C99CA8D Content-Disposition: attachment; filename=smime.p7s Content-Type: application/pkcs7-signature; name=smime.p7s Content-Transfer-Encoding: base64 MIAGCSqGSIb3DQEHAqCAMIACAQExCzAJBgUrDgMCGgUAMIAGCSqGSIb3DQEHAQAAoIICuzCCArcw ggIgAgIBTDANBgkqhkiG9w0BAQUFADCBojELMAkGA1UEBhMCVVMxCzAJBgNVBAgTAk9SMREwDwYD VQQHEwhQb3J0bGFuZDEWMBQGA1UEChMNT21uaS1FeHBsb3JlcjEWMBQGA1UECxMNSVQgRGVwYXJ0 bWVudDEbMBkGA1UEAxMSd3d3LmNvcm5lcmNhc2UuY29tMSYwJAYJKoZIhvcNAQkBFhdibG9ja291 dEBjb3JuZXJjYXNlLmNvbTAeFw0xMTA0MDYxNjE0MzFaFw0yMTA0MDMxNjE0MzFaMIGjMQswCQYD VQQGEwJVUzETMBEGA1UECBMKQ2FsaWZvcm5pYTEWMBQGA1UEBxMNU2FuIEZyYW5jaXNjbzEWMBQG A1UEChMNVmFzdC5jb20gSW5jLjEUMBIGA1UECxMLRW5naW5lZXJpbmcxGTAXBgNVBAMTEEdyYWhh bSBTYW5kZXJzb24xHjAcBgkqhkiG9w0BCQEWD2dyYWhhbUB2YXN0LmNvbTCBnzANBgkqhkiG9w0B AQEFAAOBjQAwgYkCgYEAm4K/W/0VdaOiS6tC1G8tSCAw989XCsJXxVPiny/hND6T0jVv4vP0JRiO vNzH6uoINoKQfgUKa+GCqILdY7Jdx61/WKqxltFTu5D0H8sFFNIKgf9cd3yU6t2susKrxaDXRCul pmcJ3AFg4xuG3ZUZt+XTYhBebQfjwgGQh3/pkQUCAwEAATANBgkqhkiG9w0BAQUFAAOBgQCKW+hQ JqNkPRht5fl8FHku80BLAH9ezEJtZJ6EU9fcK9jNPkAJgSEgPXQ++jE+4iYI2nIb/h5RILUxd1Ht m/yZkNRUVCg0+0Qj6aMT/hfOT0kdP8/9OnbmIp2T6qvNN2rAGU58tt3cbuT2j3LMTS2VOGykK4He iNYYqr+K6sPDHTGCAy0wggMpAgEBMIGpMIGiMQswCQYDVQQGEwJVUzELMAkGA1UECBMCT1IxETAP BgNVBAcTCFBvcnRsYW5kMRYwFAYDVQQKEw1PbW5pLUV4cGxvcmVyMRYwFAYDVQQLEw1JVCBEZXBh cnRtZW50MRswGQYDVQQDExJ3d3cuY29ybmVyY2FzZS5jb20xJjAkBgkqhkiG9w0BCQEWF2Jsb2Nr b3V0QGNvcm5lcmNhc2UuY29tAgIBTDAJBgUrDgMCGgUAoIIB2TAYBgkqhkiG9w0BCQMxCwYJKoZI hvcNAQcBMBwGCSqGSIb3DQEJBTEPFw0xNDA4MDkwMDE0NDNaMCMGCSqGSIb3DQEJBDEWBBSHpQiM FnUCQxbqfaQ6oaENPif/sDCBugYJKwYBBAGCNxAEMYGsMIGpMIGiMQswCQYDVQQGEwJVUzELMAkG A1UECBMCT1IxETAPBgNVBAcTCFBvcnRsYW5kMRYwFAYDVQQKEw1PbW5pLUV4cGxvcmVyMRYwFAYD VQQLEw1JVCBEZXBhcnRtZW50MRswGQYDVQQDExJ3d3cuY29ybmVyY2FzZS5jb20xJjAkBgkqhkiG 9w0BCQEWF2Jsb2Nrb3V0QGNvcm5lcmNhc2UuY29tAgIBTDCBvAYLKoZIhvcNAQkQAgsxgayggakw gaIxCzAJBgNVBAYTAlVTMQswCQYDVQQIEwJPUjERMA8GA1UEBxMIUG9ydGxhbmQxFjAUBgNVBAoT DU9tbmktRXhwbG9yZXIxFjAUBgNVBAsTDUlUIERlcGFydG1lbnQxGzAZBgNVBAMTEnd3dy5jb3Ju ZXJjYXNlLmNvbTEmMCQGCSqGSIb3DQEJARYXYmxvY2tvdXRAY29ybmVyY2FzZS5jb20CAgFMMA0G CSqGSIb3DQEBAQUABIGAaEwIS0/E2g19DmTdHCZgCEm0RS9yFBEW2cHJWsrB5g30enp9CDWNZyum gMLkUtYdycrxBKQFn+1FQONsbf9/PwgzNLBEhEW1j2nNTWiZjHSW8VBalFdvh6MtYDcr0X3qOMHR xHjrvtEfBIe3eoqtouGry7tgUz4R8aIBlwlMxk4AAAAAAAA= --Apple-Mail=_32275A3D-1DEE-4962-84D3-E8DE7C99CA8D--