incubator-cassandra-user mailing list archives

From Jon Haddad <...@jonhaddad.com>
Subject Re: Struggling to understand CFS and its use.
Date Sun, 17 Nov 2013 19:36:28 GMT
Having used (and moved off of) Titan, I do not recommend it as a primary database.  Until it
overcomes its extremely unoptimized graph traversals, it will increase the load on your
database by several orders of magnitude.

As a secondary analytics database, it might do fine.  Just don’t rely on it for anything
time-sensitive.

Jon


On Nov 16, 2013, at 9:10 PM, Willie Slepecki <scphantm@gmail.com> wrote:

> Hi all.  I'm in the bar napkin phase of coming up with a big app.  The application is
> going to be a large graph app, so I was drawn to Cassandra because of Titan, and because
> Cassandra's replication is far superior to Neo4j and other open source systems I have looked at.
> 
> The last issue I'm dealing with before starting to write code is random file storage.
> The application will have the ability to upload whatever (images, PDFs, etc.), and I need to
> put them somewhere.  (For the record, Amazon S3 is not an option, long story.)  So I'm looking
> at a hugely expensive RAID array or an insanely complex distributed file system.  Given the
> budget I'm dealing with, most likely a distributed file system.
> 
> Now, in the past hour or so, I stumbled on CFS.  I think I know what it is, and that
> it's not going to work for me, but I just wanted to make sure.
> 
> From what I can tell, it is a file system that does not like small files (15k images
> and such) because for each file you upload, it's going to allocate a 2 MB block.
> 
> Second, it looks like it's similar to HDFS in that the "FS" is misleading, and it
> should probably have been named CDS (Cassandra Data Store).  I mean that in the sense that it
> wasn't designed to map a drive to and drop files in with Explorer, but intended more as a
> convenient way to upload large files of structured data to your analytics engine (MapReduce
> or whatever) to have back-end processes rip apart and tell you cool things you didn't know.  Or for
> us really old guys, think of it as an easy way to dump a butt load of data into your data
> warehouse without having to write an ETL; instead, you write the ETL when you want to do
> something with it.
> 
> Third, it looks like it's commercial, from that stax something company.
> 
> Am I wrong about any of this?
> 
> Thanks
> 
> -- 
> You want it fast, cheap, or right.  Pick two!!

