cassandra-user mailing list archives

From Naresh Yadav <nyadav....@gmail.com>
Subject Re: Help on Designing Cassandra table for my usecase
Date Thu, 09 Jan 2014 18:23:07 GMT
@thunder It will be write-once about 80% of the time, but there can be cases
where the client corrects the data and we then need to overwrite it.

Thanks
Naresh


On Thu, Jan 9, 2014 at 11:49 PM, Naresh Yadav <nyadav.ait@gmail.com> wrote:

> @thunder Thanks for the guidance. Queries will be fired by the application on
> this table when users log in and browse the application, and also from mobile
> apps through a web service. Responses need to be quick, as users will be
> analysing this data on the fly. Writes also need to be fast, as there is a
> time limit within which we need to show this data to users every day.
>
> We can build aggregation in the application, outside Cassandra. But we are
> not clear what table we should design in Cassandra for the queries we need.
> Please give guidance on a possible design to handle indexing of dynamic
> tags for these queries.
>
> Thanks
> Naresh
>
>
>
> On Thu, Jan 9, 2014 at 9:41 PM, Thunder Stumpges <
> thunder.stumpges@gmail.com> wrote:
>
>> This sort of work sounds much more like a Hadoop/Hive/Pig type of
>> analysis.
>>
>> What are your latency requirements on queries? Are they ad-hoc or part of
>> an application? What is the case where you would need to change an existing
>> value? If it is write once, then Hadoop/Hive is great; if it changes
>> randomly, then not so much.
>>
>> Cassandra has the limitation that it does not support aggregation; that must
>> be done by a client. In my experience it is really suited to quickly writing
>> lots of data and looking it back up in a "random io" type manner if you
>> already know the "key" you are looking for.
>>
>> If you have the high speed write and rewrite needs, but also the "full
>> data" analytical requirements, there are plugins for using C* as a backing
>> store for Pig/Hive. It is a little finicky to get working depending on all
>> your versions but does work fairly well in my limited experience.
>>
>> Perhaps with a little better understanding of your workload needs others
>> can chime in too. Good luck.
>>
>> -Thunder
>>
>>
>> > On Jan 9, 2014, at 5:15 AM, Naresh Yadav <nyadav.ait@gmail.com> wrote:
>> >
>> > Hi all,
>> >
>> > I have a use case with huge data which I am not able to design in Cassandra.
>> >
>> > Table name : MetricResult
>> >
>> > Sample Data :
>> >
>> > Metric=Sales,    Time=Month, Period=Jan-10,     Tag=U.S.A, Tag=Pen,    Value=10
>> > Metric=Sales,    Time=Month, Period=Jan-10,     Tag=U.S.A, Tag=Pencil, Value=20
>> > Metric=Sales,    Time=Month, Period=Feb-10,     Tag=U.S.A, Tag=Pen,    Value=30
>> > Metric=Sales,    Time=Month, Period=Feb-10,     Tag=U.S.A, Tag=Pencil, Value=10
>> > Metric=Sales,    Time=Month, Period=Feb-10,     Tag=India,             Value=90
>> > Metric=Sales,    Time=Year,  Period=2010,       Tag=U.S.A,             Value=70
>> > Metric=Cost,     Time=Year,  Period=2010,       Tag=CPU,               Value=8000
>> > Metric=Cost,     Time=Year,  Period=2010,       Tag=RAM,               Value=4000
>> > Metric=Cost,     Time=Year,  Period=2011,       Tag=CPU,               Value=9000
>> > Metric=Resource, Time=Week,  Period=Week1-2013,                        Value=100
>> >
>> > So the above case involves:
>> >          Time-series data, i.e. the Time and Period columns
>> >          Dynamic columns, i.e. the Tag column
>> >          Indexing on dynamic columns, i.e. the Tag column
>> >          Aggregations: SUM, AVERAGE
>> >          Overwrites: if a value arrives again for the same Metric, Time, Period, and Tag, it overwrites the old one
>> >
>> > Queries I need to support:
>> > --------------------------------------
>> > a) Give data for Metric=Sales AND Time=Month
>> >        O/P: 5 rows
>> > b) Give data for Metric=Sales AND Time=Month AND Period=Jan-10
>> >        O/P: 2 rows
>> > c) Give data for Metric=Sales AND Tag=U.S.A
>> >        O/P: 5 rows
>> > d) Give data for Metric=Sales AND Period=Jan-10 AND Tag=U.S.A AND Tag=Pen
>> >        O/P: 1 row
>> >
>> >
>> > This table can hold terabytes of data, and a single Metric, Period combination can have millions of rows.
>> >
>> > Please suggest how to design/model this table in Cassandra. If this runs into a limitation in Cassandra, please suggest the best technology to handle it.
>> >
>> >
>> > Thanks
>> > Naresh
>>
>
>
>
>
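
One way to read the advice in this thread is: since Cassandra is best at
looking data up when you already know the key, write each row into one
lookup structure per query shape, and do SUM/AVERAGE in the client. The
sketch below only models that idea with plain Python dicts; it is not a
tested Cassandra schema and uses no Cassandra API. The names MetricStore,
by_metric_time, by_metric_tag, by_time and by_tags are hypothetical, and the
choice of (Metric, Time) and (Metric, Tag) as lookup keys is an assumption
drawn from queries (a) to (d), not something proposed by the posters.

```python
from collections import defaultdict


class MetricStore:
    """In-memory stand-in for two query-specific lookup tables (hypothetical)."""

    def __init__(self):
        # Lookup 1: key (metric, time) -> {(period, tags): value}
        self.by_metric_time = defaultdict(dict)
        # Lookup 2: key (metric, tag) -> {(time, period, tags): value}
        self.by_metric_tag = defaultdict(dict)

    def write(self, metric, time, period, tags, value):
        """Store one row in every lookup; an identical key overwrites the old value."""
        tags = tuple(sorted(tags))  # canonical order so the same tag set maps to one key
        self.by_metric_time[(metric, time)][(period, tags)] = value
        for tag in tags:
            self.by_metric_tag[(metric, tag)][(time, period, tags)] = value

    def by_time(self, metric, time, period=None):
        """Values for Metric AND Time, optionally narrowed to one Period."""
        rows = self.by_metric_time.get((metric, time), {})
        return [v for (p, _tags), v in rows.items() if period is None or p == period]

    def by_tags(self, metric, tags, period=None):
        """Values for Metric AND all given tags (at least one), optionally one Period."""
        wanted = set(tags)
        # Read the rows stored under one of the requested tags, then filter the rest.
        rows = self.by_metric_tag.get((metric, min(wanted)), {})
        return [v for (_time, p, row_tags), v in rows.items()
                if wanted <= set(row_tags) and (period is None or p == period)]


# Sample data copied from the original post.
SAMPLE = [
    ("Sales", "Month", "Jan-10", ["U.S.A", "Pen"], 10),
    ("Sales", "Month", "Jan-10", ["U.S.A", "Pencil"], 20),
    ("Sales", "Month", "Feb-10", ["U.S.A", "Pen"], 30),
    ("Sales", "Month", "Feb-10", ["U.S.A", "Pencil"], 10),
    ("Sales", "Month", "Feb-10", ["India"], 90),
    ("Sales", "Year", "2010", ["U.S.A"], 70),
    ("Cost", "Year", "2010", ["CPU"], 8000),
    ("Cost", "Year", "2010", ["RAM"], 4000),
    ("Cost", "Year", "2011", ["CPU"], 9000),
    ("Resource", "Week", "Week1-2013", [], 100),
]

store = MetricStore()
for row in SAMPLE:
    store.write(*row)

# Client-side aggregation over query (a): Metric=Sales AND Time=Month.
sales_by_month = store.by_time("Sales", "Month")
total = sum(sales_by_month)
average = total / len(sales_by_month)
print(len(sales_by_month), total, average)
```

Queries (a) and (b) map to by_time, and (c) and (d) map to by_tags. Calling
write again with the same Metric, Time, Period and tag set replaces the
stored value, which is the overwrite behaviour asked for in the original
post. The cost of this layout is that each row is written once per tag plus
once more, so write volume grows with the number of tags.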
