cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: Thrift to cql : mixed static and dynamic columns with secondary index
Date Thu, 16 Jul 2015 15:38:36 GMT
This schema is something that we're providing a better CQL conversion for
in 3.0.  The one column you defined will become a "static" column, meaning
there is only one copy of it per partition.  The schema will look something
like this:

CREATE TABLE ref_file (
    key text,
    folder text static,
    column1 text,
    value text,
    PRIMARY KEY (key, column1)
) WITH COMPACT STORAGE;

The "column1" column will hold your dynamic field names, and the "value"
column will hold your dynamic field values.

Unfortunately, we probably won't support indexing the static column in
3.0.0, but we should be able to support that pretty soon afterwards.  The
ticket for that is https://issues.apache.org/jira/browse/CASSANDRA-8103.

If you don't want to wait for 3.x, migrating to a table like this is
probably your best option:

CREATE TABLE ref_file (
    key text PRIMARY KEY,
    folder text,
    attributes map<text, text>
);

In this case, the attributes map would hold your dynamic fields.
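
As a rough sketch of how that could look in practice, assuming the
map-based table above: the index on "folder" replaces your KEYS index, and
the sample values are simply copied from your cassandra-cli example.  The
statements are illustrative and have not been run against your cluster.

```cql
-- Secondary index on the regular "folder" column (replaces the Thrift KEYS index).
CREATE INDEX ON ref_file (folder);

-- One row per file; every dynamic field becomes an entry in the map.
INSERT INTO ref_file (key, folder, attributes)
VALUES ('id1', 'folder1',
        {'name': 'file1', 'size': '1234', 'COM_1': '', 'COM_2': ''});

INSERT INTO ref_file (key, folder, attributes)
VALUES ('id2', 'folder1', {'name': 'file2', 'COM_1': ''});

-- Add or overwrite a single dynamic field without rewriting the whole map.
UPDATE ref_file SET attributes['mime'] = 'image/jpeg' WHERE key = 'id2';

-- Find all files in a folder through the secondary index.
SELECT key, attributes FROM ref_file WHERE folder = 'folder1';
```

Keep in mind that a collection is always read in its entirety, so this
layout suits rows with a modest number of dynamic fields.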

On Thu, Jul 16, 2015 at 4:22 AM, Clement Honore <honore.c@gmail.com> wrote:

> Hi,
>
> I'm trying to migrate from Cassandra 1.1 and Hector to a more up-to-date
> stack like Cassandra 1.2+ and CQL3.
>
> I have read http://www.datastax.com/dev/blog/thrift-to-cql3 but
> my use case adds a complexity which seems not documented : I have a mixed
> column family with a secondary index.
>
> The column family has one explicitly declared column, which is indexed
> natively.
> In this column family, I'm also adding columns dynamically : some with
> predictive names, some with dynamic names.
>
> If I try to query this table in cql, I can access only the declared column
> (as stated in the documentation above).
>
> If I change the declaration by removing the explicitly declared column (as
> explained in the documentation above), I lose the secondary index on it.
>
> If I explicitly declare all the columns with an already known name
> (assuming I accept that I will get plenty of columns with a null value for
> the lines which don't have those attributes), I still can't manage columns
> with a dynamic name.
> And I can't declare a collection as my comparator is UTF8Type.
>
> Should I migrate in a new table if I want to keep all the functionalities?
> This is really a solution I want to avoid.
>
> Here is an example representing my actual schema :
>
> I have a column family "REF_File" referencing my files.
> A file always has a "folder". The "folder" is indexed to easily find my
> files.
> A file may have some attributes like "name", "size", "mime".
> A file may have some comments referenced by a column "COM_X" where "X" is
> the comment ID.
>
> Column family creation :
>
> Create column family REF_File with comparator=UTF8Type and
> default_validation_class=UTF8Type and key_validation_class=UTF8Type and
> column_metadata=[{column_name: folder, validation_class: UTF8Type,
> index_type: KEYS}];
>
> set REF_File['id1']['folder']=folder1;
> set REF_File['id1']['name']=file1;
> set REF_File['id1']['size']=1234;
> set REF_File['id1']['COM_1']='';
> set REF_File['id1']['COM_2']='';
> set REF_File['id2']['folder']=folder1;
> set REF_File['id2']['name']=file2;
> set REF_File['id2']['mime']='image/jpeg';
> set REF_File['id2']['COM_1']='';
>
> Requesting :
>
> [default@DUNE_metadonnees] list REF_File;
> Using default limit of 100 Using default cell limit of 100
> -------------------
> RowKey: id1
> => (name=COM_1, value=, timestamp=1437034903045000) => (name=COM_2,
> value=, timestamp=1437034911121000) => (name=folder, value=folder1,
> timestamp=1437034833452000) => (name=name, value=file1,
> timestamp=1437034851993000) => (name=size, value=1234,
> timestamp=1437034871356000)
> -------------------
> RowKey: id2
> => (name=COM_1, value=, timestamp=1437035169011000) => (name=folder,
> value=folder1, timestamp=1437035062080000) => (name=mime, value=image/jpeg,
> timestamp=1437035145227000) => (name=name, value=file2,
> timestamp=1437035073596000)
>
> Thanks for your help !
>



-- 
Tyler Hobbs
DataStax <http://datastax.com/>
