incubator-cassandra-user mailing list archives

From Ian Holsman <>
Subject Re: Scalable data model for a Metadata database
Date Thu, 11 Feb 2010 00:39:19 GMT
Hi Jared.
you might want to look at graph databases (HyperGraphDB or Neo4j, for example) for use cases
like this.
what it seems like you are asking for is a semantic knowledge base ala

tools like Protégé and Gremlin are helpful
for this kind of thing as well.

the other issue you are going to encounter is when you want to link up 2 things.

for example marriage.
find all people whose sex == ‘male’ and age >= 20 and age <= 29 and who are married to
someone called michelle who is older than 27.
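
to make the linking problem concrete, here is a minimal Python sketch over an in-memory dict. The record layout, the "spouse" field and the function name are all assumptions made up for illustration; this is not the API of Cassandra or of any graph database. The point is only that each candidate needs a second lookup to follow the marriage link before the spouse's criteria can be checked.

```python
# Hypothetical in-memory records; ids and field names are illustrative assumptions.
people = {
    "p1": {"name": "john", "sex": "male", "age": 25, "spouse": "p2"},
    "p2": {"name": "michelle", "sex": "female", "age": 30, "spouse": "p1"},
    "p3": {"name": "bob", "sex": "male", "age": 22, "spouse": "p4"},
    "p4": {"name": "michelle", "sex": "female", "age": 26, "spouse": "p3"},
    "p5": {"name": "carl", "sex": "male", "age": 35, "spouse": None},
}


def married_men_in_twenties(people, spouse_name, spouse_min_age):
    """Return ids of men aged 20-29 whose spouse has the given name
    and is strictly older than spouse_min_age."""
    matches = []
    for person_id, person in people.items():
        # First filter on the person's own attributes.
        if person["sex"] != "male" or not (20 <= person["age"] <= 29):
            continue
        # Then follow the link: this is the extra lookup per candidate.
        spouse = people.get(person["spouse"])
        if spouse is None:
            continue
        if spouse["name"] == spouse_name and spouse["age"] > spouse_min_age:
            matches.append(person_id)
    return matches


result = married_men_in_twenties(people, "michelle", 27)
```

with the data in one dict the extra lookup is cheap; in a key/value store every followed link is another read, which is the part a graph database is designed to handle.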


On Feb 10, 2010, at 3:51 AM, Jared winick wrote:

> Thanks for the specific suggestions Jonathan, I really appreciate it.
> On Tue, Feb 9, 2010 at 9:37 AM, Jonathan Ellis <> wrote:
>> On Tue, Feb 9, 2010 at 10:01 AM, Jared winick <> wrote:
>>> Somehow I need to partition the data better.  Would a recommendation
>>> be to “split” the “sex” key into multiple keys? For example I could
>>> append the year and month to the key (“sex_022010”) to partition the
>>> data by the month it was inserted.
>> That's one possibility.  Another would be to kill two birds with one
>> stone and add the age to that key, so you'd have male_20 (probably
>> better: male_1990), etc.
>> Fundamentally TANSTAAFL and if you need to scale queries w/ lots of
>> criteria like this you will have to choose (sometimes from more than
>> one of) these options:
>>  - have a lot of machines so you can parallelize brute force queries,
>> e.g. w/ Hadoop
>>  - precompute specific "indexes" like sex_birthdate above
>>   - note, with supercolumns you can also materialize the whole
>> "person" in subcolumns, rather than doing an extra lookup for each
>> index hit
>>  - use less-specific indexes (e.g. separate sex & birthdate indexes to
>> continue the example) and do more work on the client
>> -Jonathan
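
for reference, the precomputed sex_birthdate index Jonathan describes could be sketched roughly like this in plain Python. It uses a dict of dicts as a stand-in for rows and columns; the helper names (index_key, insert_person, query) and the record fields are hypothetical and are not Cassandra client calls. Storing the whole person under the index row mirrors the "materialize the person in subcolumns" option, so a hit needs no second lookup.

```python
from collections import defaultdict


def index_key(sex, birth_year):
    """Build a composite row key such as 'male_1990'."""
    return f"{sex}_{birth_year}"


def insert_person(index, person_id, person):
    """Write the full person under its composite key (materialized copy)."""
    index[index_key(person["sex"], person["birth_year"])][person_id] = person


def query(index, sex, first_year, last_year):
    """Read one row per birth year in the range and collect the people found."""
    results = []
    for year in range(first_year, last_year + 1):
        # .get avoids creating empty rows for years with no data.
        results.extend(index.get(index_key(sex, year), {}).values())
    return results


# Stand-in for a column family: row key -> {person id -> person}.
index = defaultdict(dict)
insert_person(index, "p1", {"name": "john", "sex": "male", "birth_year": 1985})
insert_person(index, "p2", {"name": "michelle", "sex": "female", "birth_year": 1980})
insert_person(index, "p3", {"name": "carl", "sex": "male", "birth_year": 1975})

# Roughly "male, aged 20-29" as of 2010, expressed as birth years.
hits = query(index, "male", 1981, 1990)
```

the range query becomes ten small row reads instead of a scan of one huge "male" row, at the cost of writing each person into the index at insert time.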

Ian Holsman
