cassandra-user mailing list archives

From Rustam Aliyev <>
Subject Re: CassandraFS in 1.0?
Date Tue, 12 Jul 2011 10:10:57 GMT
Hi David,

This is an interesting topic, and it would be good to hear from 
someone who is using it in production.

In particular: how does your FS implementation behave for medium/large 
files, e.g. > 1 MB?

If you store large files, how large is your store per node, and how does 
it handle compactions (any performance issues while compacting large data)?

It would also be interesting to see some benchmarks and performance stats 
for reads/writes.
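For reference, the chunking scheme Joe describes further down the thread (splitting a blob across columns so no single value blows up memory or compaction) might be sketched roughly like this. All names here (chunk size, key format) are illustrative, not from any real CassandraFS schema:

```python
# Hypothetical sketch of file chunking for a column store: one row per
# file, one column per fixed-size chunk. Chunk size and key names are
# made up for illustration.

CHUNK_SIZE = 1024 * 1024  # 1 MB per column; tune for heap and compaction load

def chunk_blob(file_id: str, data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (row_key, column_name, value) triples for a file's contents."""
    for offset in range(0, max(len(data), 1), chunk_size):
        index = offset // chunk_size
        # Zero-padded index keeps columns in read order under a string sort.
        yield (file_id, f"chunk:{index:08d}", data[offset:offset + chunk_size])

def reassemble(columns):
    """Concatenate (column_name, value) pairs back into the original blob."""
    return b"".join(value for _, value in sorted(columns))
```

Reads then become a single-row column slice, and the per-column size stays bounded regardless of the file size.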


On 12/07/2011 04:51, David Strauss wrote:
> It's not, currently, but I'm happy to answer questions about its architecture.
> On Thu, Jul 7, 2011 at 10:35, Norman Maurer
> <>  wrote:
>> May I ask if it's open source by any chance?
>> bye
>> norman
>> On Thursday, 7 July 2011, David Strauss<> wrote:
>>> I'm not sure HDFS has the right properties for a media-storage file
>>> system. We have, however, built a WebDAV server on top of Cassandra
>>> that avoids any pretension of being a general-purpose, POSIX-compliant
>>> file system. We mount it on our servers using davfs2, which is also
>>> nice for a few reasons:
>>> * We can use standard HTTP load-balancing and dead host avoidance
>>> strategies with WebDAV.
>>> * Encrypting access and authenticating clients with PKI/HTTPS works seamlessly.
>>> * WebDAV + davfs2 is etag-header aware, allowing clients to
>>> efficiently validate cached items.
>>> * HTTP is browser and CDN/reverse proxy cache friendly for
>>> distributing content to people who don't need to mount the file
>>> system.
>>> * We could extend the server's support to allow connections from a
>>> broad variety of interactive desktop clients.
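For anyone unfamiliar with the ETag-based cache validation mentioned in the list above, the flow might be sketched roughly like this (illustrative Python, not davfs2's actual implementation):

```python
# Sketch of ETag revalidation: the client sends If-None-Match with its
# cached ETag; a 304 Not Modified means the cached copy is still valid,
# so the body never crosses the wire.

def conditional_headers(cached_etag):
    """Build request headers for a conditional GET against a cached item."""
    return {"If-None-Match": cached_etag} if cached_etag else {}

def cache_decision(status, cached_etag, response_etag):
    """Return ('use-cache', etag) on 304, else ('refresh', new etag)."""
    if status == 304 and cached_etag is not None:
        return "use-cache", cached_etag
    return "refresh", response_etag
```

This is what makes a WebDAV mount cheap to revalidate: unchanged files cost one small round trip instead of a full re-download.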
>>> On Wed, Jul 6, 2011 at 13:11, Joseph Stein<>  wrote:
>>>> Hey folks, I am going to start prototyping our media tier using Cassandra
>>>> as a file system (meaning: upload video/audio/images to a web server, save
>>>> them in Cassandra, and then stream them out).
>>>> Has anyone done this before?
>>>> I was thinking brisk's CassandraFS might be a fantastic implementation for
>>>> this but then I feel that I need to run another/different Cassandra cluster
>>>> outside of what our ops folks do with Apache Cassandra 0.8.X
>>>> Am I best to just compress files uploaded to the web server and then start
>>>> chunking and saving the chunks in rows and columns so the memory issue does
>>>> not smack me in the face?  And use our existing cluster and build it out
>>>> accordingly?
>>>> I am sure our ops people would like the command line aspect of CassandraFS
>>>> but looking for something that makes the most sense all around.
>>>> It seems to me there is a REALLY great thing in CassandraFS and I would love
>>>> to see it as part of 1.0 =8^)  or at a minimum some streamlined
>>>> implementation to do the same thing.
>>>> Comparing to HDFS: it is part of the Hadoop project even though Cloudera
>>>> ships a distribution of Hadoop :) maybe that can work here too _fingers_crossed_
>>>> (or mongodb->gridfs)
>>>> happy to help as I am moving down this road in general
>>>> Thanks!
>>>> /*
>>>> Joe Stein
>>>> Twitter: @allthingshadoop
>>>> */
>>> --
>>> David Strauss
>>>     |
>>>     | +1 512 577 5827 [mobile]
