incubator-cassandra-user mailing list archives

From Marcelo Elias Del Valle <>
Subject Re: best design
Date Thu, 27 Sep 2012 14:25:31 GMT
2012/9/27 Andre Tavares <>

> create column family users_test with comparator=UTF8Type and
> column_metadata=[
> {column_name: generic_key, validation_class: UTF8Type, index_type: KEYS},
> {column_name: user_key, validation_class: UTF8Type, index_type: KEYS}
> ];
> where generic_id can be: user_cook_id value, or a user_facebook_id,
> user_cell_phone, user_personal_id values ... the "problem" of this solution
> is that I have 200 million users_id x 4 keys (user_cook_id,
> user_facebook_id, user_cell_phone, user_personal_id) = 800 million rows

If I understood correctly, if any key is the same, the user is the same. So
your row key is generic_id, and for each generic_id you want to find the
corresponding user.

Your search, if I understood correctly, is: "find user by generic_id"
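
To make sure I am reading the design right, here is a rough sketch of that
lookup in plain Python, not Cassandra code. The build_lookup helper and the
sample identifier values are made up for illustration; only the four
identifier names come from your message. Each identifier value becomes its
own row pointing at the user, which is where the "users x 4" row count
comes from.

```python
# Hypothetical in-memory model of the lookup: one "row" per identifier value.
# This only illustrates the data shape; it is not how Cassandra stores rows.

def build_lookup(users):
    """Map every identifier value (the generic_id) to the owning user_key."""
    lookup = {}
    for user_key, identifiers in users.items():
        for generic_id in identifiers.values():
            lookup[generic_id] = user_key
    return lookup


# Sample data (invented values, real identifier names from the thread).
users = {
    "user-1": {
        "user_cook_id": "cook-abc",
        "user_facebook_id": "fb-123",
        "user_cell_phone": "phone-0001",
        "user_personal_id": "pid-001",
    },
    "user-2": {
        "user_cook_id": "cook-def",
        "user_facebook_id": "fb-456",
        "user_cell_phone": "phone-0002",
        "user_personal_id": "pid-002",
    },
}

lookup = build_lookup(users)

# "Find user by generic_id" is then a single key lookup.
found = lookup["fb-123"]

# Row count grows as (number of users) x (identifiers per user).
row_count = len(lookup)
```

With two users and four identifiers each, the lookup holds eight rows, the
same ratio as 200 million users giving 800 million rows.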

The way you designed it, there are no partitions. I don't know Cassandra well,
but I am not sure what would happen if you had 3 billion users, for
instance. You would have 12 billion rows... Would Cassandra have any
problem finding the user by row key? How would Cassandra index these rows?

Marcelo Elias Del Valle - @mvallebr
