cassandra-user mailing list archives

From DuyHai Doan <doanduy...@gmail.com>
Subject Re: I don't understand paging through a table by primary key.
Date Fri, 30 May 2014 18:03:50 GMT
Hello Kevin

One possible data model:

CREATE TABLE myLog(
  day int,           // day formatted as YYYYMMdd
  date timeuuid,
  log_message text,
  PRIMARY KEY (day, date)
);

 For each day, you can page through the partition by date (a timeuuid):
SELECT log_message FROM myLog WHERE day = 20140530 AND date > ... LIMIT xxx;
To fetch the next page, use the last date you received as the lower bound.
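
 The paging loop over a single day can be sketched as follows. This is
only a sketch, not driver code: fetch_page is a hypothetical callable
standing in for whatever runs the SELECT above, and make_fake_fetch is an
in-memory stand-in that uses plain integers in place of timeuuid values.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# A row is (date, log_message). Integers stand in for timeuuid values here.
Row = Tuple[int, str]

# Hypothetical signature: fetch_page(day, after, limit) runs
#   SELECT date, log_message FROM myLog WHERE day = ? [AND date > ?] LIMIT ?
# and returns at most `limit` rows in clustering order.
FetchPage = Callable[[int, Optional[int], int], List[Row]]


def page_partition(fetch_page: FetchPage, day: int, limit: int) -> Iterator[Row]:
    """Yield every row of one day partition, one page at a time."""
    last_date: Optional[int] = None  # no lower bound for the first page
    while True:
        rows = fetch_page(day, last_date, limit)
        yield from rows
        if len(rows) < limit:
            return  # a short (or empty) page means the partition is exhausted
        last_date = rows[-1][0]  # resume strictly after the last row seen


def make_fake_fetch(table: Dict[int, List[Row]]) -> FetchPage:
    """In-memory stand-in for the real query, for illustration only."""
    def fetch(day: int, after: Optional[int], limit: int) -> List[Row]:
        rows = sorted(table.get(day, []))
        if after is not None:
            rows = [r for r in rows if r[0] > after]
        return rows[:limit]
    return fetch
```

 With a real driver, fetch_page would execute the query and return the
rows; the loop itself does not change.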

 Of course, you need some client-side code to move from one day to the next.
If the log volume for one day is too large and risks creating an ultra-wide
row, you can increase the partitioning resolution and use the hour as the
partition key. In this case you would have:

CREATE TABLE myLog(
  hour int,          // hour formatted as YYYYMMddHH
  date timeuuid,
  log_message text,
  PRIMARY KEY (hour, date)
);
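
 The client-side code for moving from one partition to the next mostly
comes down to generating the bucket keys in order. A minimal sketch,
assuming naive datetimes that are already in the time zone used when
writing (UTC would be the usual choice); day_buckets and hour_buckets are
illustrative names, not part of any driver:

```python
from datetime import date, datetime, timedelta
from typing import Iterator


def day_buckets(start: date, end: date) -> Iterator[int]:
    """Yield partition keys in YYYYMMdd form, from start to end inclusive."""
    current = start
    while current <= end:
        yield int(current.strftime("%Y%m%d"))
        current += timedelta(days=1)


def hour_buckets(start: datetime, end: datetime) -> Iterator[int]:
    """Yield partition keys in YYYYMMddHH form, from start to end inclusive."""
    # Truncate to the top of the hour so the first bucket is not skipped.
    current = start.replace(minute=0, second=0, microsecond=0)
    while current <= end:
        yield int(current.strftime("%Y%m%d%H"))
        current += timedelta(hours=1)
```

 Each key is then used as the partition key value in the query, and the
rows inside that partition are paged by date as before.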





On Fri, May 30, 2014 at 7:20 PM, Russell Bradberry <rbradberry@gmail.com>
wrote:

> Then the data model you chose is incorrect.  As Rob Coli mentioned, you
> cannot page through partitions in order unless you are using an
> ordered partitioner.  Your only option is to store the data differently.
> When using Cassandra you have to remember to “model your queries, not your
> data”.  You can only page the entire table by using the TOKEN keyword, and
> this is not efficient.
>
>
>
> On May 30, 2014 at 1:17:37 PM, Kevin Burton (burton@spinn3r.com) wrote:
>
> The specific issue is I have a fairly large table, which is immutable, and
> I need to get it in a form where it can be downloaded, page by page, via an
> API.
>
> This would involve reading the whole table.
>
> I'd like to page through it by key order to efficiently read the rows to
> minimize random reads.
>
> It's slightly more complicated than that in that it's a log-structured
> table… basically holding the equivalent of Apache logs.  I need to read
> these out by time and give them to API callers.
>
>
> On Fri, May 30, 2014 at 12:21 AM, DuyHai Doan <doanduyhai@gmail.com>
> wrote:
>
>> Hello Kevin
>>
>>  Can you be more specific about the issue you're facing? What is the table
>> design? What kind of query are you running?
>>
>>  Regards
>>
>>
>> On Fri, May 30, 2014 at 7:10 AM, Kevin Burton <burton@spinn3r.com> wrote:
>>
>>> I'm trying to grok this but I can't figure it out in CQL world.
>>>
>>> I'd like to efficiently page through a table via primary key.
>>>
>>> This way I only involve one node at a time and the reads on disk are
>>> contiguous.
>>>
>>> I would have assumed it was a combination of > pk and order by but that
>>> doesn't seem to work.
>>>
>>> --
>>>
>>>  Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> Skype: *burtonator*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> <https://plus.google.com/102718274791889610666/posts>
>>>  <http://spinn3r.com>
>>>  War is peace. Freedom is slavery. Ignorance is strength. Corporations
>>> are people.
>>>
>>
>>
>
>
>
>
