hadoop-common-user mailing list archives

From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Brisk vs Cloudera Distribution
Date Thu, 09 Feb 2012 04:57:54 GMT
Hadoop can work on a number of filesystems: HDFS, S3, and local files. The
Brisk filesystem is known as CFS. CFS stores all block data and metadata in
Cassandra, so it does not use a NameNode. Brisk fires up a JobTracker
automatically as well. Brisk also has a Hive metastore backed by Cassandra,
which takes away that SPOF.
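
To make the "number of filesystems" point concrete, here is a rough sketch of how a path's URI scheme decides which filesystem implementation handles it. It is only a toy model in Python, not Hadoop code: in Hadoop the lookup goes through the fs.<scheme>.impl configuration keys. The class names are quoted from memory and should be treated as assumptions, especially the Brisk one, and filesystem_for is a hypothetical helper made up for this illustration.

```python
from urllib.parse import urlparse

# Toy stand-in for Hadoop's fs.<scheme>.impl lookup.
# Class names are assumptions quoted from memory, not verified against a release.
FS_IMPL = {
    "hdfs": "org.apache.hadoop.hdfs.DistributedFileSystem",
    "file": "org.apache.hadoop.fs.LocalFileSystem",
    "s3n": "org.apache.hadoop.fs.s3native.NativeS3FileSystem",
    "cfs": "org.apache.cassandra.hadoop.fs.CassandraFileSystem",  # Brisk (assumed)
}


def filesystem_for(path, default_scheme="hdfs"):
    """Return the implementation name that would handle `path`.

    Paths without a scheme fall back to `default_scheme`, which plays the
    role of the cluster's default filesystem setting.
    """
    scheme = urlparse(path).scheme or default_scheme
    try:
        return FS_IMPL[scheme]
    except KeyError:
        raise ValueError(f"No filesystem registered for scheme: {scheme}")


if __name__ == "__main__":
    for p in ("hdfs://namenode/user/rk/in", "cfs://node1/user/rk/in", "/user/rk/in"):
        print(p, "->", filesystem_for(p))
```

On a Brisk-style setup the default scheme would presumably be cfs rather than hdfs, so the same unqualified job paths would land in Cassandra instead of going through a NameNode.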

Brisk compresses all data with Snappy, so you may not need to use compression
or sequence files. Performance-wise, I have gotten comparable numbers with
TeraSort and TeraGen. But the two systems work vastly differently, and they
likely scale differently.

The Hive integration is solid. I am not sure what the biggest cluster is, and
I would rather not make other vague performance claims. Brisk is not active
anymore; the commercial product is DSE. There is a GitHub fork of Brisk,
however.

On Wednesday, February 8, 2012, rk vishu <talk2hadoop@gmail.com> wrote:
> Hello All,
> Could anyone help me understand the pros and cons of Brisk vs Cloudera Hadoop
> (HDFS + HBase) in terms of functionality and performance?
> Wanted to keep aside the single point of failure (NN) issue while
> Are there any big clusters in petabytes using Brisk in production? How does
> the performance of CFS compare with HDFS? How is the Hive integration?
> Thanks and Regards
> RK
