From: aaron morton
Subject: Re: row cache re-fill very slow
Date: Tue, 20 Nov 2012 10:00:56 +1300
To: user@cassandra.apache.org

> i was just wondering if anyone else is experiencing very slow ( ~ 3.5 MB/sec ) re-fill of the row cache at start up.

It was mentioned the other day.

What version are you on?
Do you know how many rows were loaded? When complete it will log a message with the pattern

"completed loading (%d ms; %d keys) row cache for %s.%s"

> How is the "saved row cache file" processed?

In version 1.1, after the SSTables have been opened, the keys in the saved row cache are read one at a time and the whole row is read into memory. This is a single-threaded operation.

In 1.2, reading the saved cache is still single-threaded, but reading the rows goes through the read thread pool, so it happens in parallel.

In both cases I do not believe the cache is stored in token (or key) order.

> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )

row_cache_keys_to_save in the yaml may help you find a happy halfway point.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/11/2012, at 3:17 AM, Andras Szerdahelyi wrote:

> Hey list,
>
> i was just wondering if anyone else is experiencing very slow ( ~ 3.5 MB/sec ) re-fill of the row cache at start up. We operate with a large row cache ( 10-15GB currently ) and we already measure startup times in hours :-)
>
> How is the "saved row cache file" processed? Are the cached row keys simply iterated over and their respective rows read from SSTables - possibly creating random reads with small enough sstable files, if the keys were not stored in a manner optimised for a quick re-fill ? - or is there a smarter algorithm ( i.e. scan through one sstable at a time, filter rows that should be in row cache ) at work and this operation is purely disk i/o bound ?
>
> ( Admittedly whatever is going on is still much more preferable to starting with a cold row cache )
>
> thanks!
> Andras
>
> Andras Szerdahelyi
> Solutions Architect, IgnitionOne | 1831 Diegem E.Mommaertslaan 20A
> M: +32 493 05 50 88 | Skype: sandrew84
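
The "completed loading" message mentioned in the reply gives both the elapsed time and the number of keys, so a rough keys-per-second figure can be worked out from it. Below is a minimal Python sketch of that arithmetic. It assumes only the message pattern quoted in the reply; the function name `parse_row_cache_load`, the sample line, and its values (`demo_ks`, `demo_cf`, the timings) are made up for illustration and are not taken from a real log.

```python
import re

# Regex derived from the message pattern quoted in the reply:
#   "completed loading (%d ms; %d keys) row cache for %s.%s"
# Any prefix a real log line carries (timestamp, level, ...) is simply skipped
# because search() is used instead of match().
LOAD_LINE = re.compile(
    r"completed loading \((\d+) ms; (\d+) keys\) row cache for ([^.\s]+)\.(\S+)"
)


def parse_row_cache_load(line):
    """Return load stats for a matching log line, or None if it does not match.

    Hypothetical helper written for this example only.
    """
    match = LOAD_LINE.search(line)
    if match is None:
        return None
    millis = int(match.group(1))
    keys = int(match.group(2))
    # Guard against a 0 ms load (for example an empty cache).
    keys_per_sec = keys / (millis / 1000.0) if millis > 0 else None
    return {
        "keyspace": match.group(3),
        "column_family": match.group(4),
        "millis": millis,
        "keys": keys,
        "keys_per_sec": keys_per_sec,
    }


if __name__ == "__main__":
    # Made-up sample: a two hour load of 3.6 million keys.
    sample = "completed loading (7200000 ms; 3600000 keys) row cache for demo_ks.demo_cf"
    stats = parse_row_cache_load(sample)
    print(stats["keyspace"], stats["column_family"], stats["keys_per_sec"])
```

Comparing this figure before and after lowering row_cache_keys_to_save is one way to judge where the halfway point between startup time and a warm cache lies.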