From: Drew Kutcharian
To: user@cassandra.apache.org
Reply-To: user@cassandra.apache.org
Subject: Re: Cassandra instead of memcached
Date: Wed, 6 Mar 2013 09:32:09 -0800

I think the dataset should fit in memory easily. The main purpose of this would be as a store for an API rate limiting/accounting system. I think the eBay guys are using C* for the same reason too. Initially we were thinking of using Hazelcast or memcached. But Hazelcast (at least the community edition) has Java GC issues with big heaps, and the problem with memcached is the lack of reliable distribution (you lose a node, you need to rehash everything), so I figured why not just use C*.

On Mar 6, 2013, at 9:08 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> If you're writing much more data than RAM, Cassandra will not work as fast as memcache. Cassandra is not magical: if all of your data fits in memory it is going to be fast, and if most of your data fits in memory it can still be fast. However, if you plan on having much more data than RAM, you need to think about more RAM and/or SSDs.
>
> We do not use C* as an "in-memory store". However, for many of our datasets we do not have a separate caching tier. In those cases Cassandra is both our "database" and our "in-memory store", if you want to use those terms :)
>
> On Wed, Mar 6, 2013 at 12:02 PM, Drew Kutcharian <drew@venarc.com> wrote:
> Thanks guys, this is what I was looking for.
>
> @Edward. I definitely like crazy ideas ;). I think the only issue here is that C* is a disk space hog, so I'm not sure that would be feasible, since free RAM is not as abundant as disk. BTW, I watched your presentation; are you guys still using C* as an in-memory store?
>
> On Mar 6, 2013, at 7:44 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
>
>> http://www.slideshare.net/edwardcapriolo/cassandra-as-memcache
>>
>> Read at ONE.
>> READ_REPAIR_CHANCE as low as possible.
>>
>> Use short TTL and short GC_GRACE.
>>
>> Make the in-memory memtable size as large as possible to avoid flushing and compacting.
>>
>> Optionally turn off the commit log.
>>
>> You can use Cassandra like memcache, but it is not a memcache replacement. Cassandra persists writes and compacts SSTables; memcache only has to keep data in memory.
>>
>> If you want to try a crazy idea, try putting your persistent data on a RAM disk! Not data/system, however!
>>
>> On Wed, Mar 6, 2013 at 2:45 AM, aaron morton <aaron@thelastpickle.com> wrote:
>> Consider disabling durable_writes in the KS config to avoid writing to the commit log. That will speed things up for you. Note that you risk losing data if Cassandra crashes or is not shut down with nodetool drain.
>>
>> Even if you set the gc_grace to 0, deletes will still need to be committed to disk.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 5/03/2013, at 9:51 AM, Drew Kutcharian <drew@venarc.com> wrote:
>>
>>> Thanks Ben, that article was actually the reason I started thinking about removing memcached.
>>>
>>> I wanted to see what would be the optimum config to use C* as an in-memory store.
>>>
>>> -- Drew
>>>
>>> On Mar 5, 2013, at 2:39 AM, Ben Bromhead <ben@instaclustr.com> wrote:
>>>
>>>> Check out http://techblog.netflix.com/2012/07/benchmarking-high-performance-io-with.html
>>>>
>>>> Netflix used Cassandra with SSDs and were able to drop their memcache layer. Mind you, they were not using it purely as an in-memory KV store.
>>>>
>>>> Ben
>>>> Instaclustr | www.instaclustr.com | @instaclustr
>>>>
>>>> On 05/03/2013, at 4:33 PM, Drew Kutcharian <drew@venarc.com> wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>> I'm thinking about using Cassandra as an in-memory key/value store instead of memcached for a new project (just to get rid of a dependency if possible). I was thinking about setting the replication factor to 1, enabling the off-heap row cache, and setting gc_grace_seconds to zero for the CF that will be used for the key/value store.
>>>>>
>>>>> Has anyone tried this? Any comments?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Drew
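
A rough illustration of the "you lose a node, you need to rehash everything" concern with memcached: when a client places keys with a plain `hash(key) % N`, removing one node changes the owner of most keys, while a consistent-hash ring (the general idea behind a token ring) only moves the keys the lost node owned. This is a sketch under assumptions, not how any particular memcached client or Cassandra itself is implemented; some memcached clients do offer consistent hashing. The node names, key format, and virtual-node count are made up for the example.

```python
import bisect
import hashlib


def stable_hash(value):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per process)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


def modulo_owner(key, nodes):
    """Naive placement: hash the key and take it modulo the node count."""
    return nodes[stable_hash(key) % len(nodes)]


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes, vnodes=64):
        # Each node gets `vnodes` pseudo-random positions (tokens) on the ring.
        self._ring = sorted(
            (stable_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._tokens = [token for token, _ in self._ring]

    def owner(self, key):
        # First token clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._tokens, stable_hash(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_fraction(before, after, keys):
    """Fraction of keys whose owner differs between two placement functions."""
    return sum(before(k) != after(k) for k in keys) / len(keys)


if __name__ == "__main__":
    keys = [f"client:{i}" for i in range(10_000)]  # hypothetical rate-limit keys
    nodes = ["node-a", "node-b", "node-c", "node-d"]
    survivors = nodes[:-1]  # pretend node-d died

    naive = moved_fraction(
        lambda k: modulo_owner(k, nodes),
        lambda k: modulo_owner(k, survivors),
        keys,
    )
    ring = moved_fraction(HashRing(nodes).owner, HashRing(survivors).owner, keys)

    print(f"modulo placement: {naive:.0%} of keys moved")
    print(f"hash ring:        {ring:.0%} of keys moved")
```

With the ring, every key that moves is one that used to live on the removed node; keys on the surviving nodes stay where they were, so only the lost node's share of the cache goes cold.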
 


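
For reference, the schema-level suggestions in this thread (replication factor 1, durable_writes off, short GC_GRACE, low READ_REPAIR_CHANCE, short TTL) could be written down roughly as the CQL below. This is only a sketch: the helper functions, the keyspace name `kv_cache`, the table name `entries`, its columns, and the default values are all made up for illustration. The option names are the Cassandra 1.2-era CQL 3 ones; `read_repair_chance` in particular was removed in later Cassandra releases. Reading at ONE is a per-request consistency level chosen by the client, and memtable sizing is a server-side setting, so neither appears here.

```python
import re

_IDENTIFIER = re.compile(r"^[a-z][a-z0-9_]*$")


def _check_identifier(name):
    """Reject anything that is not a plain lowercase CQL identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a simple CQL identifier: {name!r}")
    return name


def cache_schema_cql(keyspace="kv_cache", table="entries",
                     gc_grace_seconds=3600, read_repair_chance=0.0):
    """Build CQL for a cache-style keyspace and table (hypothetical names)."""
    ks = _check_identifier(keyspace)
    tbl = _check_identifier(table)
    if gc_grace_seconds < 0 or not 0.0 <= read_repair_chance <= 1.0:
        raise ValueError("gc_grace_seconds must be >= 0 and the chance within [0, 1]")

    # durable_writes = false skips the commit log: faster, but writes still in
    # memtables are lost if the node crashes or is stopped without a drain.
    create_keyspace = (
        f"CREATE KEYSPACE {ks} WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor': 1} "
        "AND durable_writes = false;"
    )
    # Short gc_grace_seconds lets tombstones be purged sooner at compaction.
    create_table = (
        f"CREATE TABLE {ks}.{tbl} (id text PRIMARY KEY, payload blob) "
        f"WITH gc_grace_seconds = {int(gc_grace_seconds)} "
        f"AND read_repair_chance = {float(read_repair_chance)};"
    )
    return [create_keyspace, create_table]


def put_cql(keyspace="kv_cache", table="entries", ttl_seconds=300):
    """Build a parameterised INSERT whose row expires after ttl_seconds."""
    ks = _check_identifier(keyspace)
    tbl = _check_identifier(table)
    if ttl_seconds <= 0:
        raise ValueError("ttl_seconds must be positive")
    return (
        f"INSERT INTO {ks}.{tbl} (id, payload) VALUES (?, ?) "
        f"USING TTL {int(ttl_seconds)};"
    )


if __name__ == "__main__":
    for statement in cache_schema_cql() + [put_cql()]:
        print(statement)
```

The statements are only printed, not executed; they would still need to be checked against the Cassandra version in use before being run through cqlsh or a driver.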