We do pretty much the same thing here: dynamic columns with a timestamp as the column name and a different value type for each row. We use the serialization/deserialization classes provided with Hector and store the type of the values in the row key. Example row keys:
"b6c8a1e7281761e62230ea76daa3d841#INT" => every value is an Integer
"7f30a6a2bbb1b921afc8216d8c5d9257#DOUBLE" => every value is a Double
If I had to do it again, I would try to use (Dynamic)CompositeType for the values, or an equivalent mechanism as suggested by Roland.
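
To make the row-key convention concrete, here is a minimal sketch in plain Python (not Hector, and with no Cassandra client involved). The FORMATS table and all function names are hypothetical, chosen only for illustration; the idea is simply that the suffix after "#" selects how every value in that row is packed and unpacked.

```python
import struct

# Hypothetical mapping from the row-key suffix to a struct format:
# ">i" is a 4-byte big-endian int, ">d" is an 8-byte big-endian double.
FORMATS = {"INT": ">i", "DOUBLE": ">d"}


def make_row_key(row_id, type_name):
    """Build a row key such as '<hash>#INT'."""
    if type_name not in FORMATS:
        raise ValueError("unsupported type: " + type_name)
    return row_id + "#" + type_name


def split_row_key(row_key):
    """Split on the last '#' so the type suffix is always recovered."""
    row_id, _, type_name = row_key.rpartition("#")
    return row_id, type_name


def encode_value(row_key, value):
    """Pack a value using the type named in the row key."""
    _, type_name = split_row_key(row_key)
    return struct.pack(FORMATS[type_name], value)


def decode_value(row_key, raw):
    """Unpack raw column bytes using the type named in the row key."""
    _, type_name = split_row_key(row_key)
    return struct.unpack(FORMATS[type_name], raw)[0]
```

With this layout a reader only needs the row key to know how to decode every column in the row, at the cost of one row per value type.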

On 3 July 2011 15:07, Roland Gude <roland.gude@yoochoose.com> wrote:
You could do the serialization for all your supported data types yourself (many serialization libraries are available, and a pretty thorough benchmark of them can be found here: https://github.com/eishay/jvm-serializers/wiki) and prepend an identifier for the data type to the serialized bytes.
This would not avoid casting, but it would still perform better than serializing to strings as is done in your example.
Prepending the id to the values seems better to me, because you can be sure that a new insertion to some field overwrites the correct column even if its type has changed.
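
A minimal sketch of the type-tag idea, written in plain Python with the standard struct module rather than any particular serialization library. The tag values and the encode/decode names are assumptions made up for this example; a real setup would pick its own identifiers and serializers.

```python
import struct

# Hypothetical one-byte type identifiers prepended to each value.
TAG_INT, TAG_DOUBLE, TAG_STR = 0, 1, 2


def encode(value):
    """Serialize a value and prepend a one-byte type tag."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to keep the sketch simple.
        raise TypeError("bool is not supported in this sketch")
    if isinstance(value, int):
        return bytes([TAG_INT]) + struct.pack(">q", value)
    if isinstance(value, float):
        return bytes([TAG_DOUBLE]) + struct.pack(">d", value)
    if isinstance(value, str):
        return bytes([TAG_STR]) + value.encode("utf-8")
    raise TypeError("unsupported type: " + type(value).__name__)


def decode(raw):
    """Read the tag byte, then deserialize the remaining payload."""
    tag, payload = raw[0], raw[1:]
    if tag == TAG_INT:
        return struct.unpack(">q", payload)[0]
    if tag == TAG_DOUBLE:
        return struct.unpack(">d", payload)[0]
    if tag == TAG_STR:
        return payload.decode("utf-8")
    raise ValueError("unknown type tag: " + str(tag))
```

Because the tag travels with the value, the column name stays the same when a field changes type, so a later write replaces the earlier one.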

-----Original Message-----
From: osishkin osishkin [mailto:osishkin@gmail.com]
Sent: Sunday, 3 July 2011 13:52
To: user@cassandra.apache.org
Subject: Multi-type column values in single CF

Hi all,

I need to store column values that are of various data types in a
single column family, i.e. I have column values that are integers,
others that are strings, and maybe more later. All column names are
strings (no comparator problem for me).
The thing is I need to store unstructured data - I do not have fixed
and known-in-advance column names, so I cannot use a fixed static map
for casting the values back to their original type on retrieval from

My immediate naive thought is to simply prefix every column name with
the type the value needs to be cast back to.
For example, I'll do the following conversion to the columns of some key -
{'attr1': 'val1','attr2': 100}  ~> {'str_attr1' : 'val1', 'int_attr2' : '100'}
and only then send it to Cassandra. This way I know what to
cast it back to.
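
For reference, the conversion described above can be sketched in a few lines of plain Python. The prefix tables and the function names to_stored/from_stored are hypothetical, only int and str are handled, and nothing here talks to Cassandra.

```python
# Hypothetical prefix tables; only the two types from the example are covered.
TYPE_TO_PREFIX = {str: "str", int: "int"}
PREFIX_TO_TYPE = {"str": str, "int": int}


def to_stored(columns):
    """Prefix each column name with its value type and stringify the value."""
    stored = {}
    for name, value in columns.items():
        prefix = TYPE_TO_PREFIX[type(value)]
        stored[prefix + "_" + name] = str(value)
    return stored


def from_stored(stored):
    """Strip the type prefix and cast each value back to its original type."""
    columns = {}
    for name, text in stored.items():
        # Split on the first underscore only, so attribute names may contain "_".
        prefix, _, original = name.partition("_")
        columns[original] = PREFIX_TO_TYPE[prefix](text)
    return columns
```

Note that in this scheme the same attribute written once as an int and once as a string ends up in two different columns (int_attr2 and str_attr2).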

But all this casting back and forth on the client side seems to me to
be very bad for performance.
Another option is to split the columns into dedicated column families
with matching validation types - a column family for integer values,
one for string, one for timestamp etc.
But that does not seem very efficient either (and worse for any
rollback mechanism), since now I have to perform several get calls on
multiple CFs where once I had only one.

I thought perhaps someone has encountered a similar situation in the
past, and can offer some advice on the best course of action.

Thank you,