hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: Dealing with low space cluster
Date Thu, 14 Jun 2012 13:38:30 GMT

If by processing you mean writing out more than 20 GB of map outputs
per map task, that may not be possible: map outputs have to be
materialized on local disk, so disk space is the constraint there.

Or did I misunderstand you (in assuming you are asking about
MapReduce)? Otherwise, you have ~50 GB of space available for HDFS
consumption (8 nodes x 20 GB = 160 GB raw, assuming replication = 3
for proper reliability).
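
For reference, here is a rough sketch of that estimate. The function
name and parameters are only illustrative (they are not part of any
Hadoop API), and the calculation ignores the space the OS, logs, and
intermediate map outputs also need on the same disks, so the real
figure will be somewhat lower.

```python
def usable_hdfs_capacity_gb(nodes, disk_per_node_gb, replication=3):
    """Rough upper bound on the data HDFS can hold, in GB.

    Hypothetical helper for illustration: raw cluster disk space
    divided by the replication factor, since every block is stored
    `replication` times.
    """
    raw_gb = nodes * disk_per_node_gb
    return raw_gb / replication


if __name__ == "__main__":
    # Numbers from this thread: 8 nodes with 20 GB of local storage each.
    capacity = usable_hdfs_capacity_gb(nodes=8, disk_per_node_gb=20)
    print(f"Approximate usable HDFS capacity: {capacity:.1f} GB")
```

With replication lowered to 2, the same cluster would hold roughly
80 GB, at the cost of reduced fault tolerance.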

On Thu, Jun 14, 2012 at 1:25 PM, Ondřej Klimpera <klimpond@fit.cvut.cz> wrote:
> Hello,
> we're testing an application on 8 nodes, where each node has 20GB of local
> storage available. What we are trying to achieve is to process more than
> 20GB on this cluster.
> Is there a way to distribute the data across the cluster?
> There is also one shared NFS storage disk with 1TB of available space, which
> is now unused.
> Thanks for your reply.
> Ondrej Klimpera

Harsh J
