From: aaron morton <aaron@thelastpickle.com>
To: user@cassandra.apache.org
Subject: Re: Timeseries data
Date: Sun, 31 Mar 2013 15:42:47 +0530

> I think if you use Level compaction, the number of sstables you will touch will be less because sstables in each level are non-overlapping except L0.

You will want to do some testing because LCS uses extra IO to make those guarantees. You will also want to look at the SSTable size with LCS if you are going to have wide rows.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 29/03/2013, at 12:18 PM, sankalp kohli <kohlisankalp@gmail.com> wrote:

> I think if you use Level compaction, the number of sstables you will touch will be less because sstables in each level are non-overlapping except L0.
>
>
> On Wed, Mar 27, 2013 at 8:20 PM, aaron morton <aaron@thelastpickle.com> wrote:
> sstablekeys can help you find which sstables your keys are in.
>
> But yes, a slice call will need to read from all sstables the row has a fragment in. This is one reason we normally suggest partitioning time series data by month or year or something sensible in your problem domain.
>
> You will probably also want to use reversed comparators so you do not have to use reversed in your query.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 28/03/2013, at 8:25 AM, Bryan Talbot <btalbot@aeriagames.com> wrote:
>
>> In the worst case, that is possible, but compaction strategies try to minimize the number of SSTables that a row appears in, so a row being in ALL SSTables is not likely for most cases.
>>
>> -Bryan
>>
>>
>> On Wed, Mar 27, 2013 at 12:17 PM, Kanwar Sangha <kanwar@mavenir.com> wrote:
>> Hi - I have a query on Read with Cassandra. We are planning to have a dynamic column family and each column would be based on a timeseries.
>>
>> Inserting data - key => 'xxxxxxx', {column_name => TimeUUID(now), :column_value => 'value' }, {column_name => TimeUUID(now), :column_value => 'value' }, ...
>>
>> Now this key might be spread across multiple SSTables over a period of days. When we do a READ query to fetch, say, a slice of data from this row based on time X->Y, would it need to get data from ALL sstables?
>>
>> Thanks,
>>
>> Kanwar
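
The suggestion in this thread to partition time series data by month can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything posted by the participants: the `series_id:YYYYMM` key format and the helper names `bucket_key` and `buckets_for_range` are hypothetical, and timestamps are assumed to be in one consistent time zone such as UTC. The idea is that each month of a series lives in its own row, so a slice over X->Y only reads the rows whose months overlap that range.

```python
from datetime import datetime
from typing import List


def bucket_key(series_id: str, ts: datetime) -> str:
    """Row key that confines one series to one calendar month.

    The "series_id:YYYYMM" layout is a hypothetical convention.
    """
    return f"{series_id}:{ts.year:04d}{ts.month:02d}"


def buckets_for_range(series_id: str, start: datetime, end: datetime) -> List[str]:
    """Row keys a slice from start to end (inclusive) has to read."""
    if end < start:
        raise ValueError("end must not be before start")
    keys = []
    year, month = start.year, start.month
    # Walk month by month until the bucket containing `end` is included.
    while (year, month) <= (end.year, end.month):
        keys.append(f"{series_id}:{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys


if __name__ == "__main__":
    # A write goes to the bucket for its timestamp.
    print(bucket_key("xxxxxxx", datetime(2013, 3, 27, 12, 17)))
    # A read over a time range touches only the overlapping buckets.
    print(buckets_for_range("xxxxxxx", datetime(2012, 12, 15), datetime(2013, 2, 1)))
```

A client would then issue one slice per returned key and merge the results. Whether month, year, or some other bucket size is right depends on the write rate and how wide each row is allowed to grow.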