cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Storing files in blob into Cassandra
Date Wed, 22 Jun 2011 09:59:45 GMT
> I think I have to detail my configuration. On every server of my cluster, I deploy :
>  - a Cassandra node
>  - a Tomcat instance
>  - the webapp, deployed on Tomcat
>  - Apache httpd, in front of Tomcat with mod_jakarta

You will have a bunch of services on the machine competing with each other for resources (CPU,
memory and network IO). It's not an approach I would take. 

You will also tightly couple the front end HTTP capacity to the DB capacity, e.g. consider
what happens when a Cassandra node is down for a while: what does this mean for your ability
to accept HTTP connections?
 
Requests from your web app may go to the local Cassandra node, but that's just the coordinator.
They will be forwarded on to the replicas that contain the data.  

> Data are stored with RandomPartitioner, replication factor is 2.

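RF 3 is the minimum RF at which QUORUM is less than the RF. With RF 2, QUORUM is 2
(floor(RF / 2) + 1), so QUORUM operations need both replicas up and cannot tolerate a
node being down.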
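
As a rough illustration of that arithmetic (a sketch only, not Cassandra code; the function
name `quorum` is made up for this example), QUORUM is floor(RF / 2) + 1:

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


# Compare QUORUM with RF and show how many replicas may be down
# while QUORUM reads and writes can still succeed.
for rf in (1, 2, 3, 5):
    q = quorum(rf)
    print(f"RF={rf}: QUORUM={q}, replicas that can be down={rf - q}")
```

With RF 2 the last column is 0, which is why RF 3 is the smallest setting where QUORUM
survives the loss of one replica.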

> In such a case, do you advise me to store files in Cassandra?

Depends on your scale, workload and performance requirements. I would run some tests with
the amount of data you expect to hold and the sort of workloads you need to support.  Personally
I think files are best kept in a file system until a compelling reason is found to do
otherwise. 

Hope that helps. 
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22 Jun 2011, at 20:23, Damien Picard wrote:

> >store your images / documents / etc. somewhere and reference them
> >in Cassandra.  That's the consensus that's been bandied about on this
> >list quite frequently
> 
> Thank you for your answers.
> 
> I think I have to detail my configuration. On every server of my cluster, I deploy :
>  - a Cassandra node
>  - a Tomcat instance
>  - the webapp, deployed on Tomcat
>  - Apache httpd, in front of Tomcat with mod_jakarta
> 
> In front of these, I use a Round-Robin DNS load balancer which balances requests across every httpd.
> Every Tomcat instance can access every Cassandra node, allowing them to deal with every request.
> Data are stored with RandomPartitioner, replication factor is 2.
> 
> In my case, it would be very easy to store images in Cassandra because these images will be accessible everywhere in my cluster. If I store images in the file system, I have to replicate them manually (probably with a distributed filesystem) on every server (quite complicated). This is why I prefer to store files in Cassandra.
> 
> According to Sylvain, the main thing to know is the max size of a file. Since this is for the web, I can set the max file size to 10 MB (HTTP POST max size) without disappointing my users. Furthermore, most of these files will not exceed 2 or 3 MB. In such a case, do you advise me to store files in Cassandra?
> 
> Thank you.
> 
> 2011/6/22 Sylvain Lebresne <sylvain@datastax.com>
> Let's be more precise in saying that this all depends on the
> expected size of the documents. If you know that the documents
> will average a few hundred kilobytes and be
> no more than a few megabytes (say < 5MB, even though there is
> no magic number), then storing them as blobs will work perfectly
> fine (which is not to say that storing them externally with metadata in
> Cassandra won't, but using blobs can be simpler in some cases).
> 
> I've very successfully stored tons of images as blobs in Cassandra.
> I just knew they couldn't get super big because the system wasn't
> allowing it.
> 
> The point about size is that each time you read a document,
> Cassandra has to load it (entirely) into memory to return it.
> 
> --
> Sylvain
> 
> 
> On Wed, Jun 22, 2011 at 9:22 AM, Sasha Dolgy <sdolgy@gmail.com> wrote:
> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Storing-photos-images-docs-etc-td6078278.html
> >
> > Of significance from that link (which was great until feeling lucky
> > was removed...):
> >
> > Google of terms cassandra large files + feeling lucky
> > http://www.google.com/search?q=cassandra+large+files&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
> >
> > Yields:
> > http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
> >
> >
> > --- store your images / documents / etc. somewhere and reference them
> > in Cassandra.  That's the consensus that's been bandied about on this
> > list quite frequently.  we employ a solution that uses Amazon S3 for
> > storage and Cassandra as the reference to the meta data and location
> > of the files.  works a treat
> >
> >
> > On Wed, Jun 22, 2011 at 9:07 AM, Damien Picard <picard.damien@gmail.com> wrote:
> >> Hi,
> >>
> >> I have to store some files (images, documents, etc.) for my users in a
> >> webapp. I use Cassandra for all of my data and I would like to know if it
> >> is a good idea to store these files as blobs in a Cassandra CF?
> >> Are there any contraindications, or special things to know to achieve this?
> >>
> >> Thank you
> >
> 
> 
> 
> -- 
> Damien Picard
> Axeiya Services : http://axeiya.com/
> gwt-ckeditor : http://code.google.com/p/gwt-ckeditor/
> Mon livre sur GWT : http://axeiya.com/index.php/ouvrage-gwt.html
> 

