hadoop-common-user mailing list archives

From Steve Loughran <ste...@apache.org>
Subject Re: Hey Cloudera can you help us In beating Google Yahoo Facebook?
Date Mon, 05 Oct 2009 09:39:44 GMT
Smith Stan wrote:
> Hey Cloudera genius guys .

Sorry, not cloudera. I speak for myself.

> I read this
> Via Cloudera, Hadoop is currently used by most of the giants in the
> space including Google, Yahoo, Facebook (we wrote about Facebook’s use
> of Cloudera here), Amazon, AOL, Baidu and more.

I would be doubtful that any on that list use the Cloudera distro,
because once you manage a cluster to the extent that you create your own
RPMs for PXE-preboot and kickstart installs, you know what you are doing
and will be worrying more about the power budget of your datacentre
(as measured in megawatts), and whether your off-site replication plan is
copying data to other facilities on different earthquake fault lines,
than about how hadoop-site.xml works.

> On.
> http://www.techcrunch.com/2009/10/01/hadoop-clusters-get-a-monitoring-client-with-cloudera-desktop/
> if this is true can you guys help us beat Y G and F.

This is not much different from saying these companies all use TCP/IP,
HTTP, MySQL and Linux, therefore a Linux server running Apache and
mysqld will help you to beat them.

Hadoop is a tool for very large datasets; it works best if you can group
and scan them independently.

* If you do not know what you are doing, it will not help
* if you do not have a sufficiently large dataset, it is not worth the 
* if you haven't outgrown an RDBMS, stick with the database
* Cloudera are offering to help with running/using hadoop, but they 
aren't going to code your datamining algorithms for you.

see also: http://teddziuba.com/2008/04/im-going-to-scale-my-foot-up-y.html

