incubator-cassandra-user mailing list archives

From Tristan Seligmann <mithra...@mithrandi.net>
Subject Re: BigTable-like Versioned Cells, Importing PostgreSQL Data
Date Sun, 29 Sep 2013 23:42:03 GMT
I saw that nobody had responded to this, so I thought I'd take a shot.

On Fri, Sep 20, 2013 at 6:13 AM, Keith Bogs <keith.a.boggs@gmail.com> wrote:

>
> Key          1379649588:body 1379649522:body 1379649123:title
> a.com/1.html "<html>"                        "A"
> a.com/2.html                 "<html>"        "B"
> b.com/1.html "<html>"        "<html>"        "C"
>
> But CQL doesn't seem to support this. (Yes, I've read
> http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows.)
> Once upon a time it seems Thrift and Supercolumns maybe would work?
>

I would envision a schema something like this:

CREATE TABLE fields (
    page TEXT,
    timestamp BIGINT,  -- Unix timestamps overflow a 32-bit INT in 2038
    field_name TEXT,
    field_value TEXT,
    PRIMARY KEY (page, timestamp, field_name)
) WITH CLUSTERING ORDER BY (timestamp DESC, field_name ASC);
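With a schema along those lines, the full history of a page is a single-partition read, and a time-bounded slice is just a range condition on the clustering key. A sketch of the queries (table and column names as in the schema above; timestamps assumed to be Unix epoch seconds):

```sql
-- Full history of one page (one partition read):
SELECT timestamp, field_name, field_value
FROM fields
WHERE page = 'a.com/1.html';

-- Only versions at or after a given time:
SELECT timestamp, field_name, field_value
FROM fields
WHERE page = 'a.com/1.html' AND timestamp >= 1379649123;

-- If timestamp is clustered DESC, the newest row comes first,
-- so the most recent version is simply:
SELECT timestamp, field_name, field_value
FROM fields
WHERE page = 'a.com/1.html'
LIMIT 1;
```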


> I'd want to efficiently iterate through the "history" of a particular row
> (in other words, read all the columns for a row) or efficiently iterate
> through all the latest values for the CF (not reading the entire row, just
> a column slice). In the previous example, I'd want to return the latest
> 'body' entries with timestamps for every page ("row"/"key") in the database.
>
> Some have talked of having two CFs, one for versioned data and one for
> current values?
>

I think this might be advisable, since slicing a single column out of every
row would not be very efficient; then again, it might not matter if you're
retrieving every row in the database anyway.
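As a rough sketch of that two-CF idea (the table name and batch shape here
are my own, not something from the thread): keep the versioned table for
history, and add a second table keyed by (page, field_name) only, so each
write overwrites the previous row and "latest values for every page" becomes
a plain read of that table. Something like:

```sql
-- Hypothetical companion table holding only the current values:
CREATE TABLE fields_current (
    page TEXT,
    field_name TEXT,
    timestamp BIGINT,
    field_value TEXT,
    PRIMARY KEY (page, field_name)
);

-- On each update, write to both tables, e.g. in a logged batch
-- so they stay consistent with each other:
BEGIN BATCH
  INSERT INTO fields (page, timestamp, field_name, field_value)
  VALUES ('a.com/1.html', 1379649588, 'body', '<html>');
  INSERT INTO fields_current (page, field_name, timestamp, field_value)
  VALUES ('a.com/1.html', 'body', 1379649588, '<html>');
APPLY BATCH;

-- The latest 'body' with its timestamp for every page; filtering on a
-- non-key column needs ALLOW FILTERING (or a secondary index):
SELECT page, timestamp, field_value
FROM fields_current
WHERE field_name = 'body' ALLOW FILTERING;
```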
-- 
mithrandi, i Ainil en-Balandor, a faer Ambar
