From: Naresh Yadav
Date: Thu, 9 Jan 2014 18:45:55 +0530
Subject: Help on Designing Cassandra table for my usecase
To: user@cassandra.apache.org

Hi all,

I have a use case with a huge amount of data that I am not able to model in Cassandra.

Table name : MetricResult

Sample Data :

Metric=Sales, Time=Month, Period=Jan-10, Tag=U.S.A, Tag=Pen, Value=10
Metric=Sales, Time=Month, Period=Jan-10, Tag=U.S.A, Tag=Pencil, Value=20
Metric=Sales, Time=Month, Period=Feb-10, Tag=U.S.A, Tag=Pen, Value=30
Metric=Sales, Time=Month, Period=Feb-10, Tag=U.S.A, Tag=Pencil, Value=10
Metric=Sales, Time=Month, Period=Feb-10, Tag=India, Value=90
Metric=Sales, Time=Year, Period=2010, Tag=U.S.A, Value=70
Metric=Cost, Time=Year, Period=2010, Tag=CPU, Value=8000
Metric=Cost, Time=Year, Period=2010, Tag=RAM, Value=4000
Metric=Cost, Time=Year, Period=2011, Tag=CPU, Value=9000
Metric=Resource, Time=Week, Period=Week1-2013, Value=100

So the above case involves:
        Time-series data, i.e. the Time and Period columns
        Dynamic columns, i.e. the Tag column
        Indexing on dynamic columns, i.e. the Tag column
        Aggregations: SUM, AVERAGE
        Overwrites: if a value arrives again for the same Metric, Time, Period and Tag, it overwrites the existing one
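To make the overwrite and aggregation requirements concrete, here is a minimal in-memory Python sketch. It is plain Python, not Cassandra code. It assumes that a row is identified by Metric, Time, Period and the set of its Tags. The helper names `upsert`, `total` and `average` are made up for illustration, and the repeated write of 9500 is a hypothetical value, not part of the sample data.

```python
# Illustrative only: an in-memory stand-in for the MetricResult table.
# Assumption: a row is identified by (metric, time, period, set of tags).
store = {}

def upsert(metric, time, period, tags, value):
    """Insert a value, overwriting any existing value with the same key."""
    key = (metric, time, period, frozenset(tags))
    store[key] = value

def total(metric, time):
    """SUM of all values for a metric at one time granularity."""
    return sum(v for (m, t, _period, _tags), v in store.items()
               if m == metric and t == time)

def average(metric, time):
    """AVERAGE of all values for a metric at one time granularity."""
    values = [v for (m, t, _period, _tags), v in store.items()
              if m == metric and t == time]
    return sum(values) / len(values) if values else None

upsert("Cost", "Year", "2010", ["CPU"], 8000)
upsert("Cost", "Year", "2010", ["RAM"], 4000)
upsert("Cost", "Year", "2011", ["CPU"], 9000)
# Hypothetical repeated write for the same Metric, Time, Period and Tag:
# the earlier value (9000) is replaced, no new row is created.
upsert("Cost", "Year", "2011", ["CPU"], 9500)
```

After these calls the store still holds three rows, and `total("Cost", "Year")` and `average("Cost", "Year")` are computed over the overwritten value.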

Queries I need to support:
--------------------------------------
a) Give data for Metric=Sales AND Time=Month
       O/P : 5 rows
b) Give data for Metric=Sales AND Time=Month AND Period=Jan-10
       O/P : 2 rows
c) Give data for Metric=Sales AND Tag=U.S.A
       O/P : 5 rows
d) Give data for Metric=Sales AND Period=Jan-10 AND Tag=U.S.A AND Tag=Pen
       O/P : 1 row
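The four queries can be checked against the sample data with a small Python sketch. This is plain in-memory filtering, not Cassandra code, and the `Row` type and the `row` and `query` helpers are hypothetical names. It only confirms that the expected row counts follow from the sample rows, treating a query with several Tag conditions as "the row carries every listed tag".

```python
from typing import FrozenSet, NamedTuple

class Row(NamedTuple):
    metric: str
    time: str
    period: str
    tags: FrozenSet[str]
    value: int

def row(metric, time, period, tags, value):
    """Build a Row, storing the tags as an unordered set."""
    return Row(metric, time, period, frozenset(tags), value)

# The ten sample rows from the message.
SAMPLE = [
    row("Sales", "Month", "Jan-10", ["U.S.A", "Pen"], 10),
    row("Sales", "Month", "Jan-10", ["U.S.A", "Pencil"], 20),
    row("Sales", "Month", "Feb-10", ["U.S.A", "Pen"], 30),
    row("Sales", "Month", "Feb-10", ["U.S.A", "Pencil"], 10),
    row("Sales", "Month", "Feb-10", ["India"], 90),
    row("Sales", "Year", "2010", ["U.S.A"], 70),
    row("Cost", "Year", "2010", ["CPU"], 8000),
    row("Cost", "Year", "2010", ["RAM"], 4000),
    row("Cost", "Year", "2011", ["CPU"], 9000),
    row("Resource", "Week", "Week1-2013", [], 100),
]

def query(rows, metric, time=None, period=None, tags=()):
    """Rows for a metric, optionally filtered by time and period,
    that carry every requested tag."""
    wanted = frozenset(tags)
    return [r for r in rows
            if r.metric == metric
            and (time is None or r.time == time)
            and (period is None or r.period == period)
            and wanted <= r.tags]

a = query(SAMPLE, "Sales", time="Month")
b = query(SAMPLE, "Sales", time="Month", period="Jan-10")
c = query(SAMPLE, "Sales", tags=["U.S.A"])
d = query(SAMPLE, "Sales", period="Jan-10", tags=["U.S.A", "Pen"])
print(len(a), len(b), len(c), len(d))  # 5 2 5 1
```

Queries c) and d) filter on Tag without fixing Time or Period, which is the access pattern that needs an index on the dynamic Tag column.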


This table can hold terabytes of data, and a single Metric, Period combination can have millions of rows.

Please suggest how to design/model this table in Cassandra. If some limitation in Cassandra prevents this, please suggest the best technology to handle it.


Thanks
Naresh