From: Aaron Morton
Subject: Re: Question about 'duplicate' columns
Date: Tue, 6 Aug 2013 20:10:31 +1200
To: user@cassandra.apache.org

Yes. If you overwrite much older data with new data, both "versions" of the column will remain on disk until compaction gets to work on both fragments of the row.
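
To make the mechanics concrete, here is a minimal Python sketch of that behaviour. It is a toy model under stated assumptions, not Cassandra code: each "sstable" is just a dict, timestamps are small integers, and the helper names (read, versions_on_disk, compact) are made up for illustration. It only shows that a read resolves to the newest version, while the older copy stays on disk until a compaction includes both sstables.

```python
# Toy model (hypothetical, not Cassandra internals): an "sstable" is a dict
# mapping (row_key, column_name) -> (value, write_timestamp).

def read(sstables, row_key, column):
    """Return the value with the highest write timestamp across all sstables."""
    key = (row_key, column)
    versions = [table[key] for table in sstables if key in table]
    if not versions:
        return None
    return max(versions, key=lambda version: version[1])[0]

def versions_on_disk(sstables, row_key, column):
    """Count how many sstables hold a copy of this column."""
    key = (row_key, column)
    return sum(1 for table in sstables if key in table)

def compact(sstables):
    """Merge the given sstables into one, keeping the newest version per column."""
    merged = {}
    for table in sstables:
        for key, (value, ts) in table.items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
    return [merged]

# Data loaded "a long time ago", already compacted into one large sstable.
old_big = {("row1", "price"): (100, 1), ("row1", "volume"): (5, 1)}
# A later bulk load that rewrites one of the same columns.
reload_small = {("row1", "price"): (100, 2)}

sstables = [old_big, reload_small]
before = versions_on_disk(sstables, "row1", "price")

# Compacting only the small sstable leaves the old copy in place.
partial = [old_big] + compact([reload_small])
after_partial = versions_on_disk(partial, "row1", "price")

# Only a compaction that includes both fragments removes the duplicate.
full = compact(sstables)
after_full = versions_on_disk(full, "row1", "price")

print(before, after_partial, after_full)
```

In this sketch the duplicate costs disk space and an extra fragment to merge at read time, but reads return the same value before and after compaction.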

Cheers
 
-----------------
Aaron Morton
Cassandra Consultant
New Zealand

@aaronmorton

On 6/08/2013, at 6:48 PM, Franc Carter <franc.carter@sirca.org.au> wrote:


> I've been thinking through some cases that I can see happening at some point and thought I'd ask on the list to see if my understanding is correct.
>
> Say a bunch of columns have been loaded 'a long time ago', i.e. long enough in the past that they have been compacted. My understanding is that if some of these columns get reloaded then they are likely to sit in additional sstables until the larger sstable is called up for compaction, which might be a while.
>
> The case that springs to mind is filling small gaps in data by doing bulk loads around the gap to make sure that the gap is filled.
>
> Have I understood correctly?
>
> thanks
>
> --
> Franc Carter | Systems architect | Sirca Ltd
> Level 4, 55 Harrington St, The Rocks NSW 2000
> PO Box H58, Australia Square, Sydney NSW 1215

