incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From dawood abdullah <muhammed.daw...@gmail.com>
Subject Re: Versioning in cassandra
Date Tue, 03 Sep 2013 17:50:12 GMT
I have tried with both the options creating secondary index and also tried
adding parentid to primary key, but I am getting all the files with
parentid 'yyy', what I want is the latest version of file with the
combination of parentid, fileid. Say below are the records inserted in the
file table:

insert into file (id, parentid, version, contenttype, description, name)
values ('f1', 'd1', '2011-03-04', 'pdf', 'f1 file', 'file1');
insert into file (id, parentid, version, contenttype, description, name)
values ('f1', 'd1', '2011-03-05', 'pdf', 'f1 file', 'file1');
insert into file (id, parentid, version, contenttype, description, name)
values ('f2', 'd1', '2011-03-05', 'pdf', 'f1 file', 'file1');
insert into file (id, parentid, version, contenttype, description, name)
values ('f2', 'd1', '2011-03-06', 'pdf', 'f1 file', 'file1');

I want to write a query which returns me second and last record and not the
first and third record, because for the first and third record there exists
a latest version, for the combination of id and parentid.

I am confused If at all this is achievable, please suggest.

Dawood



On Tue, Sep 3, 2013 at 10:58 PM, Vivek Mishra <mishra.vivs@gmail.com> wrote:

> create secondary index over parentid.
> OR
> make it part of clustering key
>
> -Vivek
>
>
> On Tue, Sep 3, 2013 at 10:42 PM, dawood abdullah <
> muhammed.dawood@gmail.com> wrote:
>
>> Jan,
>>
>> The solution you gave works spot on, but there is one more requirement I
>> forgot to mention. Following is my table structure
>>
>> CREATE TABLE file (
>>   id text,
>>   contenttype text,
>>   createdby text,
>>   createdtime timestamp,
>>   description text,
>>   name text,
>>   parentid text,
>>   version timestamp,
>>   PRIMARY KEY (id, version)
>>
>> ) WITH CLUSTERING ORDER BY (version DESC);
>>
>>
>> The query (select * from file where id = 'xxx' limit 1;) provided solves
>> the problem of finding the latest version file. But I have one more
>> requirement of finding all the latest version files having parentid say
>> 'yyy'.
>>
>> Please suggest how can this query be achieved.
>>
>> Dawood
>>
>>
>>
>> On Tue, Sep 3, 2013 at 12:43 AM, dawood abdullah <
>> muhammed.dawood@gmail.com> wrote:
>>
>>> In my case version can be timestamp as well. What do you suggest version
>>> number to be, do you see any problems if I keep version as counter /
>>> timestamp ?
>>>
>>>
>>> On Tue, Sep 3, 2013 at 12:22 AM, Jan Algermissen <
>>> jan.algermissen@nordsc.com> wrote:
>>>
>>>>
>>>> On 02.09.2013, at 20:44, dawood abdullah <muhammed.dawood@gmail.com>
>>>> wrote:
>>>>
>>>> > Requirement is like I have a column family say File
>>>> >
>>>> > create table file(id text primary key, fname text, version int,
>>>> mimetype text, content text);
>>>> >
>>>> > Say, I have few records inserted, when I modify an existing record
>>>> (content is modified) a new version needs to be created. As I need to have
>>>> provision to revert to back any old version whenever required.
>>>> >
>>>>
>>>> So, can version be a timestamp? Or does it need to be an integer?
>>>>
>>>> In the former case, make use of C*'s ordering like so:
>>>>
>>>> CREATE TABLE file (
>>>>    file_id text,
>>>>    version timestamp,
>>>>    fname text,
>>>>    ....
>>>>    PRIMARY KEY (file_id,version)
>>>> ) WITH CLUSTERING ORDER BY (version DESC);
>>>>
>>>> Get the latest file version with
>>>>
>>>> select * from file where file_id = 'xxx' limit 1;
>>>>
>>>> If it has to be an integer, use counter columns.
>>>>
>>>> Jan
>>>>
>>>>
>>>> > Regards,
>>>> > Dawood
>>>> >
>>>> >
>>>> > On Mon, Sep 2, 2013 at 10:47 PM, Jan Algermissen <
>>>> jan.algermissen@nordsc.com> wrote:
>>>> > Hi Dawood,
>>>> >
>>>> > On 02.09.2013, at 16:36, dawood abdullah <muhammed.dawood@gmail.com>
>>>> wrote:
>>>> >
>>>> > > Hi
>>>> > > I have a requirement of versioning to be done in Cassandra.
>>>> > >
>>>> > > Following is my column family definition
>>>> > >
>>>> > > create table file_details(id text primary key, fname text, version
>>>> int, mimetype text);
>>>> > >
>>>> > > I have a secondary index created on fname column.
>>>> > >
>>>> > > Whenever I do an insert for the same 'fname', the version should
be
>>>> incremented. And when I retrieve a row with fname it should return me the
>>>> latest version row.
>>>> > >
>>>> > > Is there a better way to do in Cassandra? Please suggest what
>>>> approach needs to be taken.
>>>> >
>>>> > Can you explain more about your use case?
>>>> >
>>>> > If the version need not be a small number, but could be a timestamp,
>>>> you could make use of C*'s ordering feature , have the database set the new
>>>> version as a timestamp and retrieve the latest one with a simple LIMIT 1
>>>> query. (I'll explain more when this is an option for you).
>>>> >
>>>> > Jan
>>>> >
>>>> > P.S. Me being a REST/HTTP head, an alarm rings when I see 'version'
>>>> next to 'mimetype' :-) What exactly are you versioning here? Maybe we can
>>>> even change the situation from a functional POV?
>>>> >
>>>> >
>>>> > >
>>>> > > Regards,
>>>> > >
>>>> > > Dawood
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> >
>>>> >
>>>>
>>>>
>>>
>>
>

Mime
View raw message