cassandra-user mailing list archives

From Damien Picard
Subject Re: Storing files in blob into Cassandra
Date Thu, 23 Jun 2011 07:36:50 GMT
> I have a simple blob class over the top of this which handles input and
> output streaming so reads/writes are only one column at a time

Thank you for the tips. I think I will do the same. For now, I have
developed a simple version that stores the entire file in one column, but
I have already observed that it is a performance killer.
If I understand you correctly, the idea is to write a "CassandraInputStream"
and a "CassandraOutputStream" that store a file in multiple columns of one row.
Could you tell me what size you use for a single column? Have you benchmarked
this to determine an ideal column size?
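
To make sure I have understood the idea, here is a minimal sketch of what I
have in mind. It is not tied to any real Cassandra client: ChunkRow is a
hypothetical in-memory stand-in for one row (column index to column value),
the class names are my own, and the chunk size is a free parameter, since the
right value is exactly what I am asking about.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.TreeMap;

/** Hypothetical stand-in for one Cassandra row: column index -> column value. */
class ChunkRow {
    final TreeMap<Integer, byte[]> columns = new TreeMap<>();
}

/** Buffers written bytes and stores one column per chunk of chunkSize bytes. */
class ChunkedOutputStream extends OutputStream {
    private final ChunkRow row;
    private final byte[] buffer;
    private int used = 0;
    private int nextColumn = 0;

    ChunkedOutputStream(ChunkRow row, int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        this.row = row;
        this.buffer = new byte[chunkSize];
    }

    @Override
    public void write(int b) {
        buffer[used++] = (byte) b;
        if (used == buffer.length) flushChunk();
    }

    private void flushChunk() {
        if (used == 0) return;
        // A real client would issue a single column insert here.
        row.columns.put(nextColumn++, Arrays.copyOf(buffer, used));
        used = 0;
    }

    /** Stores the last, possibly shorter, chunk. */
    @Override
    public void close() {
        flushChunk();
    }
}

/** Reads the columns back in order, holding only one column in memory at a time. */
class ChunkedInputStream extends InputStream {
    private final ChunkRow row;
    private int column = 0;
    private byte[] current = new byte[0];
    private int pos = 0;

    ChunkedInputStream(ChunkRow row) {
        this.row = row;
    }

    @Override
    public int read() {
        while (pos == current.length) {
            // A real client would fetch a single column here.
            byte[] next = row.columns.get(column++);
            if (next == null) return -1; // no more columns: end of file
            current = next;
            pos = 0;
        }
        return current[pos++] & 0xFF;
    }
}
```

With a chunk size of 4, a 10-byte file would end up in three columns of 4, 4
and 2 bytes, and each read or write would touch only one column.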

Thank you.

2011/6/23 Sasha Dolgy

> maybe you want to spend a few minutes reading about Haystack over at
> facebook to give you some ideas...
> Not saying what they've done is the right way... just sayin'
> On Thu, Jun 23, 2011 at 6:29 AM, AJ wrote:
> >
> > I was thinking of doing the same thing.  But, to compensate for the
> > bandwidth usage during the read, I was hoping to find a way for the
> > httpd or app server to cache the file either in RAM or on disk so
> > subsequent reads could just reference the in-mem cache or local hdd.
> > I have big data requirements, so duplicating the storage of file blobs
> > by adding them to the hdd would almost double my storage requirements.
> > So, the hdd cache would have to be limited with the LRU removed
> > periodically.
> >
> > I was thinking about making the key for each file be a relative file
> > path as if it were on disk.  This same path could also be used as its
> > actual location on disk in the local disk cache.  Using a path as the
> > key makes it flexible in many ways if I ever change my mind and want to
> > store all files on disk, or when backing-up or archiving, etc.
> >
> > But, I'm rusty on my apache http knowledge but I also thought there was
> > an apache cache mod that would use both ram and disk depending on the
> > frequency of use.  But, I don't know if you can tell it to "cache this
> > blob like it's a file".
> >
> > Just some thoughts.
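
The size-limited LRU cache keyed by relative path that AJ describes in the
quoted message could be sketched roughly as follows. This is only an
illustration under my own assumptions: BlobLruCache and its methods are
hypothetical names, and the blobs are held in memory for brevity, whereas an
on-disk version would track file sizes and delete files on eviction.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Cache keyed by relative file path, bounded by total bytes, evicting LRU first. */
class BlobLruCache {
    private final long maxBytes;
    private long currentBytes = 0;
    // accessOrder = true: iteration runs from least to most recently used.
    private final LinkedHashMap<String, byte[]> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    BlobLruCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Returns the cached blob or null; a hit marks the entry as recently used. */
    byte[] get(String path) {
        return entries.get(path);
    }

    /** Adds or replaces a blob, then evicts least recently used entries if over budget. */
    void put(String path, byte[] blob) {
        byte[] old = entries.put(path, blob);
        if (old != null) currentBytes -= old.length;
        currentBytes += blob.length;
        Iterator<Map.Entry<String, byte[]>> it = entries.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, byte[]> eldest = it.next();
            currentBytes -= eldest.getValue().length;
            it.remove();
        }
    }

    /** Membership check that does not change the usage order. */
    boolean contains(String path) {
        return entries.containsKey(path);
    }

    long sizeInBytes() {
        return currentBytes;
    }
}
```

On a miss, the application would read the blob from Cassandra and put() it
under the same path it uses as the row key.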

Damien Picard