In the Cassandra world, the best approach is to create one CF with the name and address in it.

Use a super CF with one super column for the user data and one super column for each address the user has. Pull the entire row back every time you want to read the data. No need for joins.
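
To make the layout concrete, here is a rough sketch of what one such row could look like. It uses plain Python dicts as a stand-in for the super CF and is not Cassandra client code; the row key, super column names ("user", "address:home", "address:work") and all values are made up for illustration.

```python
# Hypothetical in-memory model of a super CF: row key -> super column -> columns.
# Plain dicts only; nothing here talks to Cassandra.
users = {
    "jsmith": {
        # One super column holding the user data.
        "user": {"name": "John Smith", "id": "42"},
        # One super column per address (names are an assumed convention).
        "address:home": {"street": "1 Main St", "city": "Waterloo"},
        "address:work": {"street": "200 University Ave", "city": "Waterloo"},
    }
}


def read_user(row_key):
    """Fetch the whole row once and split it into user data and addresses."""
    row = users[row_key]
    profile = row["user"]
    addresses = {
        name: columns
        for name, columns in row.items()
        if name.startswith("address:")
    }
    return profile, addresses


profile, addresses = read_user("jsmith")
print(profile["name"], sorted(addresses))
```

Everything for one user comes back from a single row read, which is why no join is needed.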


On 18 Sep 2010, at 08:56, Alvin UW <> wrote:

Thanks Paul,

If we make a CF Name_Address(name, address) rather than an index, we have to maintain it whenever anything changes in ID_Address(Id, address) or Name_ID(name, id). It also takes up extra space.

In contrast, if Name_Address(name, address) is just an index, we can redirect the query to Name_ID(name, id) and ID_Address(Id, address) without the cost of maintenance.
Does that make sense?
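
The maintenance cost described here can be sketched as follows. This is only an illustration with plain Python dicts standing in for the three CFs; the function name update_address and the sample data are hypothetical, and the scan over name_id is the simplest possible way to find affected names, not a recommendation.

```python
# Hypothetical stand-ins for the three CFs, using plain dicts.
id_address = {"42": "1 Main St"}         # ID_Address(Id, address)
name_id = {"jsmith": "42"}               # Name_ID(name, id)
name_address = {"jsmith": "1 Main St"}   # materialised Name_Address(name, address)


def update_address(user_id, new_address):
    """Change an address and keep the materialised copy in sync."""
    id_address[user_id] = new_address
    # Extra work caused by materialising: every name that maps to this id
    # must have its copy of the address rewritten as well.
    for name, mapped_id in name_id.items():
        if mapped_id == user_id:
            name_address[name] = new_address


update_address("42", "9 New Rd")
print(name_address["jsmith"])
```

With a non-materialised index, the second half of update_address would disappear, at the price of doing two lookups on every read.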


2010/9/16 Rock, Paul <>
Alvin - assuming I understand what you're after correctly, why not make a CF Name_Address(name, address)? Modifying the Cassandra methods to do the "join" you describe seems like overkill to me...


On Sep 15, 2010, at 7:34 PM, Alvin UW wrote:


I am going to build an index to join two CFs.
First, we treat this index as a CF/SCF. The difference is that I don't materialise it.
Assume we have two tables:
ID_Address(Id, address), Name_ID(name, id)
Then the index is: Name_Address(name, address)

When the application queries Name_Address, it supplies the value of "name".
I want to direct the read operation to Name_ID to get the "Id" value, then go to ID_Address to
get the "address" value for that "Id". So far, I am considering only the read operation.
This way, the join query is transparent to the user.
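
The two-step read described above can be sketched like this. It is written as ordinary application-level Python over dicts, purely to show the lookup order; it does not reflect how the read would be wired into the Cassandra server, and the function name and sample data are hypothetical.

```python
# Hypothetical stand-ins for the two CFs, using plain dicts.
name_id = {"jsmith": "42"}            # Name_ID(name, id)
id_address = {"42": "1 Main St"}      # ID_Address(Id, address)


def get_address_by_name(name):
    """Answer a Name_Address(name, address) query without materialising it."""
    # Step 1: Name_ID lookup, name -> id.
    user_id = name_id.get(name)
    if user_id is None:
        return None  # unknown name, nothing to join against
    # Step 2: ID_Address lookup, id -> address.
    return id_address.get(user_id)


print(get_address_by_name("jsmith"))
```

The caller only ever passes a name and gets an address back, so the join stays hidden behind one call, at the cost of two reads per query.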

So I think I need to find out which methods or classes handle the read operation in the scenario above.
For example, which methods exactly does the Cassandra CLI command "get Keyspace1.Standard2['jsmith']" call
on the server side?

I noticed that CassandraServer is used to listen to clients, and that it has methods such as get() and get_slice().
Is that the right place to modify in order to implement my idea?