hadoop-mapreduce-user mailing list archives

From Dhiraj <jar...@gmail.com>
Subject Re: S3 with Hadoop 2.5.0 - Not working
Date Thu, 11 Sep 2014 08:13:59 GMT
Hi Harsh,

I am a newbie to Hadoop.
I am able to start the nodes with Hadoop 1.1.x and 1.2.x using the property
below, but not with 2.5.0 (fs.defaultFS). For the 1.x releases I don't need
to specify hdfs:// as you suggested; it works with s3://.

<property>
    <name>fs.default.name</name>
    <value>s3://bucket1</value>
</property>
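
If I understand your earlier suggestion correctly, on 2.5.0 the deprecated
fs.default.name key is replaced by fs.defaultFS, and that should stay on
HDFS while s3:// URIs are passed to jobs directly. A sketch of the
core-site.xml I am now trying (the HDFS host/port, bucket name and keys are
placeholders for my setup):

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>
<property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>MY_ACCESS_KEY</value>
</property>
<property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>MY_SECRET_KEY</value>
</property>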

Is there any configuration I need to do for the 2.5.0 release (classpath,
etc.)?

Also, how do I debug a command like "hadoop fs -ls s3://bucket1/" - is
there any way of increasing the log level? I want to know which address and
port that command resolves to. I have my S3 object store server running on
some address and port, so I would like to know whether "hadoop fs -ls
s3://" connects to Amazon or to my server.
I don't see any information in the namenode/datanode logs.
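
One thing I did find in the hadoop launch scripts: the client log level can
apparently be raised per-command through the HADOOP_ROOT_LOGGER environment
variable, e.g.

HADOOP_ROOT_LOGGER=DEBUG,console hadoop fs -ls s3://bucket1/

I am hoping the DEBUG output on the console will show which host and port
the S3 filesystem client actually connects to, but I may be missing
something.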

cheers,
Dhiraj

On Wed, Sep 10, 2014 at 3:13 PM, Harsh J <harsh@cloudera.com> wrote:

> > Incorrect configuration: namenode address dfs.namenode.servicerpc-address
> > or dfs.namenode.rpc-address is not configured.
> > Starting namenodes on []
>
> NameNode/DataNode are part of an HDFS service. It makes no sense to try
> to run them over an S3 default URL, since S3 is a distributed filesystem
> in itself. The services need fs.defaultFS to be set to an HDFS URI to
> be able to start up.
>
> > but unable to get an s3 config started via hadoop
>
> You can run jobs over S3 input and output data by running a regular MR
> cluster on HDFS - just pass the right URIs as the input and output
> parameters of the job. To do this, set your S3 properties in
> core-site.xml but leave fs.defaultFS as an HDFS URI.
>
> > There is an s3.impl entry up to the 1.2.1 release. So does the 2.5.0
> > release support s3, or do I need to do anything else?
>
> In Apache Hadoop 2 we dynamically load the FS classes, so we do not
> need the fs.NAME.impl configs anymore as we did in Apache Hadoop 1.
>
> On Wed, Sep 10, 2014 at 1:15 PM, Dhiraj <jarihd@gmail.com> wrote:
> > Hi,
> >
> > I have downloaded hadoop-2.5.0 and am trying to get it working with an
> > S3 backend (single-node, pseudo-distributed mode).
> > I have made changes to the core-site.xml according to
> > https://wiki.apache.org/hadoop/AmazonS3
> >
> > I have a backend object store running on my machine that supports S3.
> >
> > I get the following message when I try to start the daemons:
> > Incorrect configuration: namenode address dfs.namenode.servicerpc-address
> > or dfs.namenode.rpc-address is not configured.
> >
> >
> > root@ubuntu:/build/hadoop/hadoop-2.5.0# ./sbin/start-dfs.sh
> > Incorrect configuration: namenode address dfs.namenode.servicerpc-address
> > or dfs.namenode.rpc-address is not configured.
> > Starting namenodes on []
> > localhost: starting namenode, logging to
> > /build/hadoop/hadoop-2.5.0/logs/hadoop-root-namenode-ubuntu.out
> > localhost: starting datanode, logging to
> > /build/hadoop/hadoop-2.5.0/logs/hadoop-root-datanode-ubuntu.out
> > Starting secondary namenodes [0.0.0.0]
> > 0.0.0.0: starting secondarynamenode, logging to
> > /build/hadoop/hadoop-2.5.0/logs/hadoop-root-secondarynamenode-ubuntu.out
> > root@ubuntu:/build/hadoop/hadoop-2.5.0#
> >
> > The daemons don't start after the above.
> > I get the same error if I add the property "fs.defaultFS" and set its
> > value to the s3 bucket, but if I change defaultFS to hdfs:// it works
> > fine - I am able to launch the daemons.
> >
> > my core-site.xml:
> > <configuration>
> >     <property>
> >         <name>fs.defaultFS</name>
> >         <value>s3://bucket1</value>
> >     </property>
> >     <property>
> >         <name>fs.s3.awsAccessKeyId</name>
> >         <value>abcd</value>
> >     </property>
> >     <property>
> >         <name>fs.s3.awsSecretAccessKey</name>
> >         <value>1234</value>
> >     </property>
> > </configuration>
> >
> >
> > I am able to list the buckets and their contents via s3cmd and boto, but
> > I am unable to get an s3 config started via hadoop.
> >
> > Also, in the following core-default.xml listed on the website, I don't
> > see an implementation for s3:
> > http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
> >
> > There is an s3.impl entry up to the 1.2.1 release. So does the 2.5.0
> > release support s3, or do I need to do anything else?
> >
> > cheers,
> > Dhiraj
> >
> >
> >
>
>
>
> --
> Harsh J
>
