incubator-cassandra-user mailing list archives

From Даниел Симеонов <dsimeo...@gmail.com>
Subject question about how columns are deserialized in memory
Date Wed, 28 Apr 2010 11:56:22 GMT
Hi,
   I have a question: if a row in a Column Family contains only regular
columns, are all of its columns deserialized into memory whenever any one of
them is needed? As I understand it, that is the case, and if the Column Family
is a Super Column Family, then only the (entire) Super Column is brought into
memory? What about the row cache: is it different from the memtable?
I have another question. Suppose data is only ever inserted, and the approach
is to keep adding columns to rows in a Column Family. Is it possible in
Cassandra to split a row once a certain threshold is reached, say 100 columns
per row? What happens if there are concurrent inserts?
The original data model and use case is to insert timestamped data and to
make range queries. The original keys of the CF rows had the form
<id>.<timestamp>, each row holding a single column with the data, and OPP was
used. This is not an optimal solution, since some nodes are hotter than
others. I am thinking of changing the model to use keys like
<id>.<year/month/day>, each row holding a list of columns whose timestamps
fall within that range, and then either using RandomPartitioner, or keeping
OPP but preprocessing part of the key with MD5, i.e. the key is
MD5(<id>.<year/month/day>) + "hour of the day". The remaining problem is how
to deal with a large number of columns being inserted into a particular row.
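
To make the second key scheme concrete, here is a rough Python sketch of how
such keys could be built. It is only an illustration: the function names, the
"sensor42" id, the two-digit zero-padded hour suffix, and the millisecond
column names are my assumptions, and nothing here calls a Cassandra client.

```python
import hashlib
from datetime import datetime, timezone


def day_bucket_key(source_id: str, ts: datetime) -> str:
    """Build a row key of the form MD5(<id>.<year/month/day>) + hour of the day.

    The MD5 prefix spreads different ids and days around the ring under OPP,
    while the hour suffix keeps the rows of one id and day adjacent.
    Assumption: the hour is zero-padded to two digits so keys sort correctly
    as strings.
    """
    day = ts.strftime("%Y/%m/%d")
    digest = hashlib.md5(f"{source_id}.{day}".encode("utf-8")).hexdigest()
    return f"{digest}{ts.hour:02d}"


def column_name(ts: datetime) -> int:
    """Column name as milliseconds since the epoch (an assumed encoding),
    so that columns within a row are ordered by time."""
    return int(ts.timestamp() * 1000)


if __name__ == "__main__":
    # Hypothetical id and timestamp, used only for illustration.
    ts = datetime(2010, 4, 28, 11, 56, 22, tzinfo=timezone.utc)
    print(day_bucket_key("sensor42", ts), column_name(ts))
```

With this layout every row covers one hour of one id, so the number of columns
per row is bounded by the insert rate per hour rather than growing without
limit, and a range query over one day only has to scan the 24 keys that share
the same MD5 prefix.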
Thank you very much!
Best regards, Daniel.
