cassandra-commits mailing list archives

From "Peter Lin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-4964) Return column metadata for dynamic columns
Date Wed, 14 Nov 2012 16:04:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13497181#comment-13497181
] 

Peter Lin commented on CASSANDRA-4964:
--------------------------------------

I read that page a few months back. I believe I have a rough understanding of how Cassandra
transposes wide rows into multiple rows. As the page describes: "That is how CQL3 allows access
to wide rows: by transposing one internal wide rows into multiple CQL3 rows, one per cell
of the wide row. This is however just a different way to view the same information."

For the sake of better understanding, say I define a column family using the first form.

create column family clicks
    with key_validation_class = UTF8Type
     and comparator = DateType
     and default_validation_class = UTF8Type

When I insert dynamic columns with CQL, the name has to be a DateType and the value has to
be a string. Say I also insert dynamic columns with Thrift, using a String for the column name
and a BigDecimal for the value. Obviously that doesn't conform to date/utf8, but there's nothing
stopping me from doing that with Thrift. For me, that flexibility is one of the nice features
of Cassandra.

If I try to query all columns with "select * from clicks", Cassandra will happily run the
query and return the results to me. However, a driver like FluentCassandra wouldn't know how
to deserialize a column that doesn't match the default types. That leaves me 2 options: the
first is to not use CQL in situations where the contents of a row vary in type; the second is
to only select columns that match the default name/value types.
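
To make the deserialization problem concrete, here is a minimal sketch in Python. It is not FluentCassandra or any real driver code. It assumes a date column name is serialized as 8 big-endian bytes of milliseconds since the epoch, and the helper name decode_with_defaults and the sample byte strings are made up for illustration.

```python
import struct


def decode_with_defaults(name: bytes, value: bytes):
    """Decode a column using only the declared defaults (date name, UTF-8 value).

    Assumption: a date name is 8 big-endian bytes of milliseconds since the epoch.
    Returns (millis, text), or None when the bytes do not fit those types.
    """
    try:
        if len(name) != 8:
            raise ValueError("a date name is assumed to be exactly 8 bytes")
        millis = struct.unpack(">q", name)[0]
        return millis, value.decode("utf-8")
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        return None


# A column that conforms to the defaults, as a CQL insert would produce.
cql_column = (struct.pack(">q", 1352909052000), "home".encode("utf-8"))

# A column written through Thrift with a string name and a non-text binary
# value (hypothetical bytes standing in for a serialized BigDecimal).
thrift_column = ("user_agent".encode("utf-8"), b"\x00\x00\x00\x02\xff\x85")

print(decode_with_defaults(*cql_column))
print(decode_with_defaults(*thrift_column))
```

The first column decodes cleanly, while the second cannot be decoded from the defaults alone. Note that a string name that happens to be exactly 8 bytes long would be silently misread as a date, which is the worse failure mode.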

Even though Cassandra transposes wide rows into multiple rows for the storage engine, is the
metadata about the type present? In one of my use cases, I built a temporal database on top
of Cassandra. My temporal database only uses Thrift and helper classes to read/write, so my
classes always know what type a dynamic column is.

In this other use case, where inserts and queries are made with both Thrift and CQL, returning
type metadata for the name in org.apache.cassandra.thrift.Column could make things easier. That's
assuming the metadata is stored in the transposed internal storage format.
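
As a rough illustration of what per-column type metadata would buy a driver, the sketch below dispatches on type names carried with each column. The TypedColumn class and its name_type/value_type fields are hypothetical; they are not part of org.apache.cassandra.thrift.Column today. The 8-byte big-endian layout for DateType and LongType is an assumption of this sketch.

```python
import struct
from dataclasses import dataclass


@dataclass
class TypedColumn:
    """Hypothetical column that carries the type names of its name and value."""
    name: bytes
    value: bytes
    name_type: str
    value_type: str


# Decoders keyed by type name; only a few types are covered in this sketch.
DECODERS = {
    "UTF8Type": lambda raw: raw.decode("utf-8"),
    "DateType": lambda raw: struct.unpack(">q", raw)[0],  # assumed: millis since epoch
    "LongType": lambda raw: struct.unpack(">q", raw)[0],
}


def decode(col: TypedColumn):
    """Decode name and value with the decoders the column itself names."""
    return DECODERS[col.name_type](col.name), DECODERS[col.value_type](col.value)


# A column that does not match the date/utf8 defaults still decodes correctly.
col = TypedColumn(b"user_agent", struct.pack(">q", 42), "UTF8Type", "LongType")
print(decode(col))
```

With the type names attached, the client no longer has to guess from the column family defaults, so rows with mixed types can be read through a single code path.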
                
> Return column metadata for dynamic columns
> ------------------------------------------
>
>                 Key: CASSANDRA-4964
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4964
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.1.6
>            Reporter: Peter Lin
>
> Currently, org.apache.cassandra.thrift.CqlMetadata doesn't return the column name and
> value metadata for dynamic columns. If I execute a query against a dynamic column that was
> inserted through thrift or hector, the name and/or value type may not be the same as the default
> types declared in the column family definition.
> If the dynamic column was inserted through CQL, it will conform to the defined default
> types for column name and value. Even in that case, it is still nice to have the metadata
> returned. That will facilitate developing tools for CQL and make it easier on people writing
> drivers for Cassandra.
> I'm willing to contribute to this, if someone points me to the right place. I've read
> a lot of the core cassandra code, but I haven't gone through all of CQL yet.

