hadoop-common-user mailing list archives

From "Malcolm Matalka" <mmata...@millennialmedia.com>
Subject Persistent HDFS On EC2
Date Wed, 11 Mar 2009 13:30:52 GMT
If this is not the correct place to ask Hadoop + EC2 questions, please
let me know.


I am trying to get a handle on how to use Hadoop on EC2 before
committing any money to it.  My question is: how do I maintain a
persistent HDFS between restarts of instances?  Most of the tutorials I
have found involve the cluster being wiped once all the instances are
shut down, but in my particular case I will be feeding the output of the
previous day's run in as the input of the current day's run, and this
data will get large over time.  I see I can use S3 as the file system.
Or would I just create an EBS volume for each instance?  What are my
options?



