incubator-cassandra-user mailing list archives

From Drew Kutcharian <d...@venarc.com>
Subject Re: Data Modeling: How to keep track of arbitrarily inserted column names?
Date Thu, 04 Apr 2013 21:45:06 GMT
Hi Edward,

I expect the column names to be reused a lot. For example, key1 will appear in many rows,
so the number of distinct column names should be much smaller than the number of rows.
Is there a way to have a separate CF that keeps track of the column names?

I was thinking of a separate CF to which I write only the column name, with a null value,
every time I write a key/value to the main CF. If that column name already exists, it will
just be overwritten. To get all the column names, I can then just query that CF. I'm not
sure that's the best approach at high load (100k inserts a second), though.
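
A minimal sketch of that idea, using plain Python dicts as stand-ins for the two CFs. This is not the Cassandra API: the class name, the `index_writes` counter, and the per-process `_seen` cache (which skips index writes for names this process has already recorded) are all assumptions made for illustration.

```python
class ColumnNameTracker:
    """In-memory stand-in for a main CF plus a column-name index CF."""

    def __init__(self):
        self.main_cf = {}      # row key -> {column name: value}
        self.names_cf = {}     # column name -> empty value (the index)
        self.index_writes = 0  # how many writes actually reached the index
        self._seen = set()     # hypothetical per-process cache of indexed names

    def insert(self, row_key, column, value):
        # Always write the key/value pair to the main CF.
        self.main_cf.setdefault(row_key, {})[column] = value
        # Writing the same name twice just overwrites it, so the index write
        # is idempotent; the cache only avoids repeating it from this process.
        if column not in self._seen:
            self.names_cf[column] = b""
            self.index_writes += 1
            self._seen.add(column)

    def all_column_names(self):
        # Reading the index returns every distinct name without scanning rows.
        return sorted(self.names_cf)


tracker = ColumnNameTracker()
for row, col in [("Row1", "key1"), ("Row1", "key2"), ("Row2", "key1"),
                 ("Row2", "key3"), ("Row3", "key4"), ("Row3", "key5"),
                 ("Row4", "key2"), ("Row4", "key5")]:
    tracker.insert(row, col, "XXX")
print(tracker.all_column_names())
```

Because the index write is an overwrite rather than a read-then-write, no locking is needed. Whether a local cache like `_seen` is worth having depends on how many distinct names there are and how many writer processes exist.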

-- Drew


On Apr 4, 2013, at 12:02 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> You cannot get only the column names (which you are calling keys). You can use
> get_range_slices, which returns all the columns. When you specify an empty byte array
> (new byte[0]) as the start and finish, you get back all the columns. From there you can
> return only the column names to the user in a format that you like.
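
The scan-based approach can be sketched as follows. It assumes the full result of the range scan has already been fetched into a dict of row key to columns; the function name and the dict layout are hypothetical and stand in for whatever the client library returns.

```python
def distinct_column_names(rows):
    """Collect every column name seen across all fetched rows."""
    names = set()
    for columns in rows.values():
        # Each row contributes all of its column names; values are ignored.
        names.update(columns)
    return names


# Example data mirroring the rows in the original question.
rows = {
    "Row1": {"key1": "XXX", "key2": "XXX"},
    "Row2": {"key1": "XXX", "key3": "XXX"},
    "Row3": {"key4": "XXX", "key5": "XXX"},
    "Row4": {"key2": "XXX", "key5": "XXX"},
}
print(sorted(distinct_column_names(rows)))
```

Note that this reads every column of every row, so its cost grows with the total amount of data rather than with the number of distinct names.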
> 
> 
> On Thu, Apr 4, 2013 at 2:18 PM, Drew Kutcharian <drew@venarc.com> wrote:
> Hey Guys,
> 
> I'm working on a project and one of the requirements is to have a schema-free CF where
> end users can insert arbitrary key/value pairs per row. What would be the best way to find
> out all the "keys" that were inserted (preferably without any locking)? For example,
> 
> Row1 => key1 -> XXX, key2 -> XXX
> Row2 => key1 -> XXX, key3 -> XXX
> Row3 => key4 -> XXX, key5 -> XXX
> Row4 => key2 -> XXX, key5 -> XXX
> …
> 
> The query would be "give me all the inserted keys" and the response would be
> {key1, key2, key3, key4, key5}.
> 
> Thanks,
> 
> Drew
> 
> 

