hadoop-common-user mailing list archives

From "Lincoln Ritter" <linc...@lincolnritter.com>
Subject Re: Namenode Exceptions with S3
Date Fri, 11 Jul 2008 15:31:46 GMT
Thanks Tom!

Your explanation makes things a lot clearer.  I think that changing
'fs.default.name' to something like 'dfs.namenode.address' would
certainly be less confusing, since it would clarify the purpose of
these values.



On Fri, Jul 11, 2008 at 4:21 AM, Tom White <tom.e.white@gmail.com> wrote:
> On Thu, Jul 10, 2008 at 10:06 PM, Lincoln Ritter
> <lincoln@lincolnritter.com> wrote:
>> Thank you, Tom.
>> Forgive me for being dense, but I don't understand your reply:
> Sorry! I'll try to explain it better (see below).
>> Do you mean that it is possible to use the Hadoop daemons with S3 but
>> the default filesystem must be HDFS?
> The HDFS daemons use the value of "fs.default.name" to set the
> namenode host and port, so if you set it to an S3 URI, you can't run
> the HDFS daemons. In that case you would use the start-mapred.sh
> script instead of start-all.sh.
>> If that is the case, can I
>> specify the output filesystem on a per-job basis and can that be an S3
>> FS?
> Yes, that's exactly how you do it.
>> Also, is there a particular reason to not allow S3 as the default FS?
> You can use S3 as the default FS; it's just that you then can't run
> HDFS at all. You would only do this if you don't want to use HDFS,
> for example, if you were running a MapReduce job which read from S3
> and wrote to S3.
> It might be less confusing if the HDFS daemons didn't use
> fs.default.name to define the namenode host and port. Just like
> mapred.job.tracker defines the host and port for the jobtracker,
> dfs.namenode.address (or similar) could define the namenode. Would
> this be a good change to make?
> Tom
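
To make the two points above concrete, here is a minimal sketch in plain
Java (java.net.URI only, no Hadoop classes). It is not Hadoop's actual
code. The class name, both helper methods, the host
namenode.example.com, port 9000, and the bucket my-bucket are all
hypothetical, chosen only for illustration. It shows why a namenode
host and port cannot be derived from a non-HDFS default URI, and why a
fully qualified per-job path does not depend on the default filesystem.

```java
import java.net.URI;

// Illustrative sketch only: models the behaviour described in the thread,
// not the real Hadoop implementation.
class DefaultFsSketch {

    // Hypothetical helper: derive "host:port" for the namenode from the
    // value of fs.default.name. Only an hdfs:// URI carries that information.
    static String namenodeAddress(String defaultFs) {
        URI uri = URI.create(defaultFs);
        if (!"hdfs".equals(uri.getScheme())) {
            throw new IllegalArgumentException(
                "Cannot derive a namenode address from non-HDFS URI: " + defaultFs);
        }
        return uri.getHost() + ":" + uri.getPort();
    }

    // Hypothetical helper: a path that names its own scheme is used as is;
    // a bare path is resolved against the default filesystem.
    static URI qualify(String defaultFs, String path) {
        URI p = URI.create(path);
        return p.isAbsolute() ? p : URI.create(defaultFs).resolve(p);
    }

    public static void main(String[] args) {
        String hdfsDefault = "hdfs://namenode.example.com:9000"; // assumed value

        // With an HDFS default, the daemons can find their host and port.
        System.out.println(namenodeAddress(hdfsDefault));

        // A bare path lands on the default filesystem ...
        System.out.println(qualify(hdfsDefault, "/user/lincoln/output"));

        // ... while a per-job S3 path keeps its own filesystem.
        System.out.println(qualify(hdfsDefault, "s3://my-bucket/output"));

        // With an S3 default there is no namenode address to derive.
        try {
            namenodeAddress("s3://my-bucket");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In a real job the same idea would apply to the input and output paths
given in the job configuration: a path written with an explicit s3://
scheme selects S3 for that job even when HDFS is the default.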
