cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: Storing files in blob into Cassandra
Date Wed, 22 Jun 2011 12:28:58 GMT
> If the Cassandra JVM is down, Tomcat and Httpd will continue to handle requests. And Pelops
will redirect these requests to another Cassandra node on another server (maybe I am wrong
with this assertion).

I was thinking of the server being turned off / broken / rebooting / disconnected from the
network / taken out of rotation for maintenance. There are lots of reasons for a server to
not be doing what it should be.


-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 22 Jun 2011, at 23:10, Damien Picard wrote:

> 
> 
> 2011/6/22 aaron morton <aaron@thelastpickle.com>
>> I think I have to detail my configuration. On every server of my cluster, I deploy:
>>  - a Cassandra node
>>  - a Tomcat instance
>>  - the webapp, deployed on Tomcat
>>  - Apache httpd, in front of Tomcat with mod_jakarta
> 
> You will have a bunch of services on the machine competing with each other for resources
(cpu, memory and network IO). It's not an approach I would take. 
> 
> You will also tightly couple the front end HTTP capacity to the DB capacity. e.g. consider
what happens when a cassandra node is down for a while, what does this mean for your ability
to accept http connections?
> If the Cassandra JVM is down, Tomcat and Httpd will continue to handle requests. And
Pelops will redirect these requests to another Cassandra node on another server (maybe
I am wrong with this assertion).
>  
> Requests from your web app may go to the local cassandra node, but that's just the coordinator.
They will be forwarded onto the replicas that contain the data.  
> Yes, but as you noted before, this node can be down, so I will configure Pelops to redistribute
requests to another node. So there is no strong coupling between Cassandra and Tomcat; it will
work as if they were on different servers.
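The redistribution idea described above can be sketched generically as "try the preferred node, then fall back to the others". This is only an illustration of the pattern, not Pelops's actual API: `NodeDown`, `execute_with_failover`, `fake_send` and the node names are all hypothetical.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


class NodeDown(Exception):
    """Raised by the (hypothetical) send function when a node is unreachable."""


def execute_with_failover(nodes: Sequence[str], send: Callable[[str], T]) -> T:
    """Try each node in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for node in nodes:
        try:
            return send(node)
        except NodeDown as exc:
            last_error = exc  # remember the failure and try the next node
    raise NodeDown(f"all {len(nodes)} nodes failed") from last_error


def fake_send(node: str) -> str:
    """Stand-in for a real request: pretend the local Cassandra JVM is down."""
    if node == "local":
        raise NodeDown(node)
    return f"served by {node}"


# The local node is tried first; the request then falls over to a remote node.
result = execute_with_failover(["local", "node-b", "node-c"], fake_send)
```

Note that this only covers the case where the Cassandra process is down while the web tier on the same machine is still running.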
> 
>> Data are stored with RandomPartitioner, replication factor is 2.
> 
> RF 3 is the minimum RF you need to use for QUORUM to be less than the RF. 
> Thank you for this advice; I will reconsider the RF, but for now I use only
CL.ONE, not QUORUM. That could change in the near future.
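To make the RF advice concrete: a quorum is computed as floor(RF / 2) + 1, so with RF 2 a QUORUM operation needs both replicas, while with RF 3 it can still succeed with one replica down. A minimal sketch of that arithmetic (the function names are illustrative and not part of any Cassandra client):

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM operation."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: quorum={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With CL.ONE, as used here, a single live replica is enough, so the difference only matters once QUORUM is adopted.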
> 
>> In such case, do you advise me to store files in Cassandra ?
> 
> Depends on your scale, workload and performance requirements. I would do some tests about
how much data you expect to hold and what sort of workloads you need to support.  Personally
I think files are best kept in a file system, until a compelling reason is found to do
otherwise.
> Thank you, I think that distributing files in the cluster with something like distributed
file systems is a compelling reason to store files on Cassandra. I don't want to add another
complex component to my arch.
> 
> Hope that helps. 
> 
> It does ! A lot ! Thank you. 
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 22 Jun 2011, at 20:23, Damien Picard wrote:
> 
>> >store your images / documents / etc. somewhere and reference them
>> >in Cassandra.  That's the consensus that's been bandied about on this
>> >list quite frequently
>> 
>> Thank you for your answers.
>> 
>> I think I have to detail my configuration. On every server of my cluster, I deploy:
>>  - a Cassandra node
>>  - a Tomcat instance
>>  - the webapp, deployed on Tomcat
>>  - Apache httpd, in front of Tomcat with mod_jakarta
>> 
>> In front of these, I use a round-robin DNS load balancer which balances requests across
every httpd.
>> Every Tomcat instance can access every Cassandra node, allowing them to deal with
every request.
>> Data are stored with RandomPartitioner, replication factor is 2.
>> 
>> In my case, it would be very easy to store images in Cassandra because these images
will be accessible everywhere in my cluster. If I store images in the file system, I have to replicate
them manually (probably with a distributed filesystem) on every server (quite complicated).
This is why I prefer to store files in Cassandra.
>> 
>> According to Sylvain, the main thing to know is the max size of a file. Since this
is a web application, I can set the max file size to 10 MB (HTTP POST max size) without
disappointing my users. Furthermore, most of these files will not exceed 2 or 3 MB. In that
case, do you advise me to store files in Cassandra?
>> 
>> Thank you.
>> 
>> 2011/6/22 Sylvain Lebresne <sylvain@datastax.com>
>> Let's be more precise in saying that this all depends on the
>> expected size of the documents. If you know that the documents
>> will be around a few hundred kilobytes on average and
>> no more than a few megabytes (say < 5MB, even though there is
>> no magic number), then storing them as blob will work perfectly
>> fine (which is not saying storing them externally with metadata in
>> Cassandra won't, but using blobs can be simpler in some cases).
>> 
>> I've very successfully stored tons of images as blobs in Cassandra.
>> I just knew they couldn't get super big because the system wasn't
>> allowing it.
>> 
>> The point about size is that each time you get a document,
>> Cassandra will have to load it (entirely) in memory to return it.
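Because a whole value is loaded into memory on each read, one common way to bound the cost is to split a file into fixed-size chunks and store each chunk as its own value. The sketch below is an assumption for illustration, not something proposed in this thread: the chunk size is arbitrary, and a plain dict stands in for a row whose column names are chunk indexes.

```python
from typing import Dict

CHUNK_SIZE = 512 * 1024  # assumed chunk size (512 KiB); tune from your own tests


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Dict[int, bytes]:
    """Map chunk index -> chunk bytes; each entry would be stored as one column."""
    return {
        offset // chunk_size: data[offset:offset + chunk_size]
        for offset in range(0, len(data), chunk_size)
    }


def reassemble(chunks: Dict[int, bytes]) -> bytes:
    """Concatenate the chunks in index order to rebuild the original file."""
    return b"".join(chunks[index] for index in sorted(chunks))


# A fake 1,280,000-byte "image": two full chunks plus a partial third one.
image = bytes(range(256)) * 5000
chunks = split_into_chunks(image)
restored = reassemble(chunks)
```

A reader can then fetch one chunk at a time, so no single read has to hold the whole file.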
>> 
>> --
>> Sylvain
>> 
>> 
>> On Wed, Jun 22, 2011 at 9:22 AM, Sasha Dolgy <sdolgy@gmail.com> wrote:
>> > http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Storing-photos-images-docs-etc-td6078278.html
>> >
>> > Of significance from that link (which was great until feeling lucky
>> > was removed...):
>> >
>> > Google of terms cassandra large files + feeling lucky
>> > http://www.google.com/search?q=cassandra+large+files&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
>> >
>> > Yields:
>> > http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
>> >
>> >
>> > --- store your images / documents / etc. somewhere and reference them
>> > in Cassandra.  That's the consensus that's been bandied about on this
>> > list quite frequently.  we employ a solution that uses Amazon S3 for
>> > storage and Cassandra as the reference to the meta data and location
>> > of the files.  works a treat
>> >
>> >
>> > On Wed, Jun 22, 2011 at 9:07 AM, Damien Picard <picard.damien@gmail.com> wrote:
>> >> Hi,
>> >>
>> >> I have to store some files (Images, documents, etc.) for my users in a
>> >> webapp. I use Cassandra for all of my data and I would like to know whether it
>> >> is a good idea to store these files as blobs in a Cassandra CF?
>> >> Are there any contraindications, or special things to know to achieve this?
>> >>
>> >> Thank you
>> >
>> 
>> 
>> 
>> -- 
>> Damien Picard
>> Axeiya Services : http://axeiya.com/
>> gwt-ckeditor : http://code.google.com/p/gwt-ckeditor/
>> Mon livre sur GWT : http://axeiya.com/index.php/ouvrage-gwt.html
>> 
> 
> 
> 
> 
> -- 
> Damien Picard
> Axeiya Services : http://axeiya.com/
> gwt-ckeditor : http://code.google.com/p/gwt-ckeditor/
> Mon livre sur GWT : http://axeiya.com/index.php/ouvrage-gwt.html
> 

