incubator-cassandra-user mailing list archives

From Damien Picard <picard.dam...@gmail.com>
Subject Re: Storing files in blob into Cassandra
Date Wed, 22 Jun 2011 12:43:23 GMT
In this case, the load balancer has to detect (or be configured to know) that the
server is down and stop routing requests to it.
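
One common way for a load balancer to detect a dead server is an active health check. The sketch below is only an illustration under stated assumptions: the names `is_reachable` and `healthy_backends` are hypothetical, and it assumes that a successful TCP connect is a good enough signal that a server can take requests. Plain round-robin DNS does not run such checks by itself, so something like this would have to live in a separate component.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_backends(backends, timeout: float = 1.0):
    """Keep only the (host, port) pairs that currently accept TCP connections."""
    return [(host, port) for (host, port) in backends if is_reachable(host, port, timeout)]
```

A router would call `healthy_backends` periodically and send traffic only to the servers it returns.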

2011/6/22 aaron morton <aaron@thelastpickle.com>

> If the Cassandra JVM is down, Tomcat and Httpd will continue to handle
> requests, and Pelops will redirect these requests to another Cassandra node
> on another server (though I may be wrong about this).
>
>>
> I was thinking of the server being turned off / broken / rebooting /
> disconnected from the network / taken out of rotation for maintenance. There
> are lots of reasons for a server to not be doing what it should be.
>
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22 Jun 2011, at 23:10, Damien Picard wrote:
>
>
>
> 2011/6/22 aaron morton <aaron@thelastpickle.com>
>
>> I think I have to detail my configuration. On every server of my cluster,
>> I deploy :
>>  - a Cassandra node
>>  - a Tomcat instance
>>  - the webapp, deployed on Tomcat
>>  - Apache httpd, in front of Tomcat with mod_jakarta
>>
>>
>> You will have a bunch of services on the machine competing with each other
>> for resources (cpu, memory and network IO). It's not an approach I would
>> take.
>>
>> You will also tightly couple the front end HTTP capacity to the DB
>> capacity. e.g. consider what happens when a cassandra node is down for a
>> while, what does this mean for your ability to accept http connections?
>>
> If the Cassandra JVM is down, Tomcat and Httpd will continue to handle
> requests, and Pelops will redirect these requests to another Cassandra node
> on another server (though I may be wrong about this).
>
>>
>> Requests from your web app may go to the local cassandra node, but that's
>> just the coordinator. They will be forwarded onto the replicas that contain
>> the data.
>>
> Yes, but as you noted before, this node can be down, so I will configure
> Pelops to redistribute requests to another node. So there is no tight
> coupling between Cassandra and Tomcat; it will work as if they were on
> different servers.
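
The failover behaviour described here can be sketched generically. This is not Pelops's API: `call_with_failover`, `AllNodesFailed` and the `request` callable are hypothetical names, and the sketch assumes that a dead node shows up as a `ConnectionError`.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


class AllNodesFailed(Exception):
    """Raised when no node in the list could serve the request."""


def call_with_failover(nodes: Sequence[str], request: Callable[[str], T]) -> T:
    """Try `request` against each node in order and return the first successful result."""
    errors = []
    for node in nodes:
        try:
            return request(node)
        except ConnectionError as exc:
            # Assumption: an unreachable node surfaces as ConnectionError; try the next one.
            errors.append((node, exc))
    raise AllNodesFailed(errors)
```

Putting the local node first in `nodes` keeps the common case local while still allowing the request to fall through to remote nodes.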
>
>>
>> Data are stored with RandomPartitioner; the replication factor is 2.
>>
>>
>> RF 3 is the minimum RF you need to use for QUORUM to be less than the RF.
>>
> Thank you for this advice; I will reconsider the RF, but for now I
> use only CL.ONE, not QUORUM. That could change in the near future.
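
The arithmetic behind the RF advice can be checked quickly. A quorum is a majority of the replicas, i.e. floor(RF / 2) + 1; the helper names below are hypothetical and exist only for this illustration.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for a QUORUM operation."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor: int) -> int:
    """Replicas that can be down while QUORUM operations still succeed."""
    return replication_factor - quorum(replication_factor)


for rf in (1, 2, 3, 5):
    print(f"RF={rf}: QUORUM={quorum(rf)}, tolerated failures={tolerated_failures(rf)}")
```

With RF 2 the quorum is 2, so QUORUM needs every replica and tolerates no failure; RF 3 is the smallest value for which one replica can be down.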
>
>>
>> In such case, do you advise me to store files in Cassandra ?
>>
>>
>> Depends on your scale, workload and performance requirements. I would run
>> some tests based on how much data you expect to hold and what sort of workloads
>> you need to support.  Personally I think files are best kept in a file
>> system, until a compelling reason is found to do otherwise.
>>
> Thank you. I think that distributing files across the cluster, as a
> distributed file system would, is a compelling reason to store files in
> Cassandra. I don't want to add another complex component to my architecture.
>
>>
>> Hope that helps.
>>
>
> It does! A lot! Thank you.
>
>>  -----------------
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 22 Jun 2011, at 20:23, Damien Picard wrote:
>>
>> >store your images / documents / etc. somewhere and reference them
>> >in Cassandra.  That's the consensus that's been bandied about on this
>> >list quite frequently
>>
>> Thank you for your answers.
>>
>> I think I have to detail my configuration. On every server of my cluster,
>> I deploy :
>>  - a Cassandra node
>>  - a Tomcat instance
>>  - the webapp, deployed on Tomcat
>>  - Apache httpd, in front of Tomcat with mod_jakarta
>>
>> In front of these, I use round-robin DNS load balancing, which spreads
>> requests across every httpd.
>> Every Tomcat instance can access every Cassandra node, allowing any of them to
>> deal with any request.
>> Data are stored with RandomPartitioner; the replication factor is 2.
>>
>> In my case, it would be very easy to store images in Cassandra because
>> these images would be accessible everywhere in my cluster. If I store images
>> on the file system, I have to replicate them myself (probably with a
>> distributed file system) on every server, which is quite complicated. This is
>> why I prefer to store files in Cassandra.
>>
>> According to Sylvain, the main thing to know is the maximum size of a file.
>> Since this is a web application, I can set the maximum file size to 10 MB
>> (the HTTP POST max size) without disappointing my users. Furthermore, most of
>> these files will not exceed 2 or 3 MB. In that case, do you advise me to
>> store files in Cassandra?
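
A size cap like this is simplest to enforce in the webapp before anything is written to Cassandra. The sketch below is a hypothetical guard, not code from this thread: the names `MAX_UPLOAD_BYTES`, `FileTooLarge` and `check_upload_size` are made up, and it assumes the 10 MB limit means 10 * 1024 * 1024 bytes.

```python
# Assumption: the 10 MB cap from the message, read as binary megabytes.
MAX_UPLOAD_BYTES = 10 * 1024 * 1024


class FileTooLarge(ValueError):
    """Raised when an upload exceeds the configured size limit."""


def check_upload_size(payload: bytes, limit: int = MAX_UPLOAD_BYTES) -> bytes:
    """Return `payload` unchanged if it fits within `limit`, otherwise raise FileTooLarge."""
    if len(payload) > limit:
        raise FileTooLarge(f"upload is {len(payload)} bytes, limit is {limit} bytes")
    return payload
```

Rejecting oversized uploads up front keeps every stored blob within a size the cluster is known to handle.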
>>
>> Thank you.
>>
>> 2011/6/22 Sylvain Lebresne <sylvain@datastax.com>
>>
>>> Let's be more precise and say that this all depends on the
>>> expected size of the documents. If you know that the documents
>>> will be around a few hundred kilobytes on average and
>>> no more than a few megabytes (say < 5MB, even though there is
>>> no magic number), then storing them as blobs will work perfectly
>>> fine (which is not to say that storing them externally with metadata in
>>> Cassandra won't, but using blobs can be simpler in some cases).
>>>
>>> I've very successfully stored tons of images as blobs in Cassandra.
>>> I just knew they couldn't get super big because the system wasn't
>>> allowing it.
>>>
>>> The point about size is that each time you fetch a document,
>>> Cassandra will have to load it (entirely) into memory to return it.
>>>
>>> --
>>> Sylvain
>>>
>>>
>>> On Wed, Jun 22, 2011 at 9:22 AM, Sasha Dolgy <sdolgy@gmail.com> wrote:
>>> >
>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Storing-photos-images-docs-etc-td6078278.html
>>> >
>>> > Of significance from that link (which was great until feeling lucky
>>> > was removed...):
>>> >
>>> > Google of terms cassandra large files + feeling lucky
>>> >
>>> http://www.google.com/search?q=cassandra+large+files&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a
>>> >
>>> > Yields:
>>> > http://wiki.apache.org/cassandra/FAQ#large_file_and_blob_storage
>>> >
>>> >
>>> > --- store your images / documents / etc. somewhere and reference them
>>> > in Cassandra.  That's the consensus that's been bandied about on this
>>> > list quite frequently.  we employ a solution that uses Amazon S3 for
>>> > storage and Cassandra as the reference to the meta data and location
>>> > of the files.  works a treat
>>> >
>>> >
>>> > On Wed, Jun 22, 2011 at 9:07 AM, Damien Picard <picard.damien@gmail.com> wrote:
>>> >> Hi,
>>> >>
>>> >> I have to store some files (images, documents, etc.) for my users in a
>>> >> webapp. I use Cassandra for all of my data and I would like to know if it
>>> >> is a good idea to store these files as blobs in a Cassandra CF.
>>> >> Are there any contraindications, or special things to know to achieve this?
>>> >>
>>> >> Thank you
>>> >
>>>
>>
>>
>>
>> --
>> Damien Picard
>> Axeiya Services : http://axeiya.com/
>> gwt-ckeditor : http://code.google.com/p/gwt-ckeditor/
>> Mon livre sur GWT : http://axeiya.com/index.php/ouvrage-gwt.html
>>
>>
>>
>
>
>
>
>


-- 
Damien Picard
Axeiya Services : http://axeiya.com/
gwt-ckeditor : http://code.google.com/p/gwt-ckeditor/
Mon livre sur GWT : http://axeiya.com/index.php/ouvrage-gwt.html
