cassandra-user mailing list archives

From cass savy <casss...@gmail.com>
Subject Re: Data tiered compaction and data model question
Date Thu, 19 Feb 2015 21:58:50 GMT
Any feedback on date-tiered compaction? Has anybody used it?

On Thu, Feb 19, 2015 at 6:06 AM, Kai Wang <depend@gmail.com> wrote:

> What's the typical size of the data field? Unless it's very large, I don't
> think table 2 is a "very" wide row (10x20x60x24=288000 events/partition at
> worst). Plus you only need to store 30 days of data. The overall data size
> is 288000x30=8,640,000 events. I am not even sure you need C*, depending on
> event size.
>
> On Thu, Feb 19, 2015 at 12:00 AM, cass savy <casssavy@gmail.com> wrote:
>
>> 10-20 per minute is the average. The worst case can be 10x the average.
>>
>> On Wed, Feb 18, 2015 at 4:49 PM, Mohammed Guller <mohammed@glassbeam.com>
>> wrote:
>>
>>>  What is the maximum number of events that you expect in a day? What is
>>> the worst-case scenario?
>>>
>>>
>>>
>>> Mohammed
>>>
>>>
>>>
>>> *From:* cass savy [mailto:casssavy@gmail.com]
>>> *Sent:* Wednesday, February 18, 2015 4:21 PM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Data tiered compaction and data model question
>>>
>>>
>>>
>>> We want to track events in a log CF/table and should be able to query for
>>> events that occurred in a range of minutes or hours for a given day.
>>> Multiple events can occur in a given minute. I have listed two table
>>> designs and am leaning towards table 1 to avoid a large wide row. Please
>>> advise.
>>>
>>>
>>>
>>> *Table 1*: not a very wide row; we would still be able to query for a
>>> range of minutes for a given day, and/or for a given day and a range of
>>> hours
>>>
>>> CREATE TABLE log_Event
>>> (
>>>   event_day text,
>>>   event_hr int,
>>>   event_time timeuuid,
>>>   data text,
>>>   PRIMARY KEY ((event_day, event_hr), event_time)
>>> )
>>>
>>> *Table 2*: this will be a very wide row
>>>
>>>
>>>
>>> CREATE TABLE log_Event
>>> (
>>>   event_day text,
>>>   event_time timeuuid,
>>>   data text,
>>>   PRIMARY KEY (event_day, event_time)
>>> )
>>>
>>>
>>>
>>> *Date-tiered compaction: recommended for time series data as per the doc
>>> below. Our data will be kept for only 30 days, hence we thought of using
>>> this compaction strategy.*
>>>
>>> http://www.datastax.com/dev/blog/datetieredcompactionstrategy
>>>
>>> Created table 1 listed above with this compaction strategy, added some
>>> rows and did a manual flush. I do not see any sstables created yet. Is
>>> that expected?
>>>
>>> compaction={'max_sstable_age_days': '1', 'class': 'DateTieredCompactionStrategy'}
>>>
>>>
>>>
>>
>>
>
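
For reference, the table 1 design and compaction setting quoted above can be
put together roughly as follows. This is only a sketch, not a tested
recommendation: it assumes a keyspace has already been selected with USE, the
default_time_to_live value (30 days in seconds) is an assumption added to
reflect the 30-day retention mentioned in the thread, and the sample day, hour
and payload values are made up for illustration.

```cql
-- Table 1 from the thread: one partition per (day, hour), events ordered by timeuuid.
CREATE TABLE log_event (
    event_day  text,
    event_hr   int,
    event_time timeuuid,
    data       text,
    PRIMARY KEY ((event_day, event_hr), event_time)
) WITH compaction = {'class': 'DateTieredCompactionStrategy',
                     'max_sstable_age_days': '1'}
  AND default_time_to_live = 2592000;  -- assumption: 30 days x 86400 s, not in the original post

-- Hypothetical sample row; now() generates a timeuuid for the current time.
INSERT INTO log_event (event_day, event_hr, event_time, data)
VALUES ('2015-02-18', 9, now(), 'example payload');

-- Range of minutes within one hour of a given day (09:15 up to 09:45 UTC).
SELECT event_time, data
FROM log_event
WHERE event_day = '2015-02-18'
  AND event_hr = 9
  AND event_time >= minTimeuuid('2015-02-18 09:15+0000')
  AND event_time <  minTimeuuid('2015-02-18 09:45+0000');
```

With this layout a range of hours spans several partitions, so it would
presumably be read with one such query per hour (or an IN restriction on
event_hr, where the Cassandra version in use supports it).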
