hadoop-common-user mailing list archives

From Allen Wittenauer ...@yahoo-inc.com>
Subject Re: Multicluster Communication
Date Fri, 19 Jun 2009 17:07:52 GMT
On 6/19/09 3:49 AM, "Harish Mallipeddi" <harish.mallipeddi@gmail.com> wrote:
> Why do you want to do this in the first place? It seems like you want
> cluster1 to be a plain HDFS cluster and cluster2 to be a mapred cluster.
> Doing something like that will be disastrous - Hadoop is all about sending
> computation closer to your data. If you don't want that, you need not even
> use hadoop.

    Given some of the limitations with HDFS (quota operability, security), I
can easily see why it would be desirable to have static data coming from one
grid while doing computation/intermediate outputs/real output to another.

    Using performance as your sole metric of viability is a bigger disaster
waiting to happen.  "Sure, we crashed the file system, but look how fast it
went down in flames!"
