cassandra-user mailing list archives

From time <t...@digg.com>
Subject Re: Cassandra users survey
Date Wed, 25 Nov 2009 23:38:36 GMT

>> 2) a practical/situational view of managing a cassandra cluster
>> ...
>> it would be nice to have a more comprehensive deployment guide.
>>     
> You're right.  Maybe we can get Digg to share theirs. :)
>   
We don't have any such thing. The deployment at Digg is just as alpha as 
the deployment anywhere else. The database team is still trying to 
figure out how to tune, monitor/alert on, and deploy the cluster. So far 
it's chaotic.

We have no experience with what to do when a node fails, a rack fails, 
or a datacentre fails.

Our response to data corruption so far has been "lose that 
data, hope the bug was fixed, redeploy next version up."

Our answer to "Cassandra performance has degraded in an unusual fashion" 
has been to shut Cassandra down and work on an upgrade path.

If anything, I might advise an entity undertaking a Cassandra deployment 
to "have developers on staff that can help you administer the cluster by 
way of hacking the source code" because, honestly, that's how we've done 
it thus far.

I expect once Cassandra features, architecture, and bugginess stabilise 
(I understand we're on the cusp of that now), the database team at Digg 
will take nearly 100% responsibility for the cluster, and at that point 
we will write extensive documentation about administering the cluster. 
My estimate is 3-9 months from now.

I guess since this is the users survey thread, I should list what I wish 
I had. I would love to have a CLI that can tell me:

         1. What's the keyspace?
         2. What column families exist?
         3. What supercolumns exist?
         4. What columns are part of a particular supercolumn?
         5. What is the key range for a given column family?
         6. What are the last N rows in this column family?
         7. What are the first N rows?
         8. If I query a key range M..N, what nodes would likely answer?
         9. For a given structure I can see, what is the underlying
            directory, file, and memory structure? What SSTables make up
            this column family? Which are compacted? What are their
            sizes? How many tombstones are in each? Etc.

I would want this all from the point of view of a CLI. I would not want 
to have to log in to any particular node via a shell to ask these 
questions (so "Just look at the XML config file!" is not the proper answer).
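
To make question 8 concrete, here is a rough sketch of how a client could guess which nodes answer for a key. It is a toy model, not Cassandra's code: the ring, the node names, the replication factor, and the helper names (token_for, replicas_for_token, nodes_for_key) are all made up for illustration, and it assumes an MD5-based token and "next nodes clockwise" replica placement.

```python
import bisect
import hashlib

# Hypothetical 4-node ring as (token, node name), sorted by token.
# Tokens and names are invented for illustration only.
TOKEN_SPACE = 2**127
RING = sorted([
    (0, "node-a"),
    (2**125, "node-b"),
    (2**126, "node-c"),
    (3 * 2**125, "node-d"),
])


def token_for(key: str) -> int:
    """Map a row key to a token by hashing it (assumed MD5-style scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def replicas_for_token(token: int, ring=RING, replication_factor: int = 3):
    """Return the first node whose token is >= the given token (wrapping
    around the ring), followed by the next nodes clockwise."""
    tokens = [t for t, _ in ring]
    start = bisect.bisect_left(tokens, token) % len(ring)
    count = min(replication_factor, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


def nodes_for_key(key: str, ring=RING, replication_factor: int = 3):
    """Guess which nodes hold replicas of a single row key."""
    return replicas_for_token(token_for(key), ring, replication_factor)
```

For a key range M..N, a client could run this per key and union the results. With hashed tokens, neighbouring keys land on unrelated parts of the ring, so a range may well touch every node.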

Think of a "shell" client of Cassandra that allows exploration and 
navigation by way of Cassandra-specific ls, cd, ps, cat, head, tail.
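
A minimal sketch of what such a shell could look like, using Python's standard cmd module. Everything Cassandra-specific here is an assumption: the SCHEMA dictionary is a hard-coded placeholder standing in for whatever a real client would fetch from the cluster, and the CassandraShell class with its ls and cd commands is hypothetical, not an existing tool.

```python
import cmd

# Placeholder schema snapshot: keyspace -> column family -> properties.
# A real client would ask the cluster for this instead of hard-coding it.
SCHEMA = {
    "Keyspace1": {
        "Standard1": {"type": "Standard"},
        "Super1": {"type": "Super"},
    }
}


class CassandraShell(cmd.Cmd):
    """Toy navigation shell: ls and cd over keyspaces and column families."""

    prompt = "cassandra:/> "

    def __init__(self, schema, **kwargs):
        super().__init__(**kwargs)
        self.schema = schema
        self.path = []  # [] is the root, [keyspace] is inside a keyspace

    def listing(self):
        """Names visible at the current location."""
        if not self.path:
            return sorted(self.schema)
        return sorted(self.schema[self.path[0]])

    def do_ls(self, arg):
        """List keyspaces at the root, or column families in a keyspace."""
        for name in self.listing():
            print(name, file=self.stdout)

    def do_cd(self, arg):
        """Enter a keyspace from the root, or go back up with 'cd ..'."""
        if arg == "..":
            self.path = []
        elif not self.path and arg in self.schema:
            self.path = [arg]
        else:
            print(f"cannot cd to: {arg}", file=self.stdout)
        self.prompt = "cassandra:/" + "/".join(self.path) + "> "

    def do_exit(self, arg):
        """Leave the shell."""
        return True

# To try it interactively: CassandraShell(SCHEMA).cmdloop()
```

Commands like head, tail, and cat would be further do_* methods that query rows. They are left out here because they depend on a client API this sketch does not assume.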

--
timeless

