Subject: Re: S3 with Hadoop 2.5.0 - Not working
From: Dhiraj <jarihd@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 11 Sep 2014 13:43:59 +0530

Hi Harsh,

I am a newbie to Hadoop.
I am able to start the nodes with Hadoop 1.1.x and Hadoop 1.2.x using the following property, but not with 2.5.0 (fs.defaultFS).
For the 1.x releases I don't need to specify hdfs:// like you suggested; it works with s3://

<property>
    <name>fs.default.name</name>
    <value>s3://bucket1</value>
</property>

Is there any configuration that I need to do for the 2.5.0 release (classpath, etc.)?

Also, how do I debug a command like "hadoop fs -ls s3://bucket1/" - is there any way of increasing the log level? I want to know what address and port the command resolves to. I have my S3 object store server running on a particular address and port, so I would like to know whether "hadoop fs -ls s3://" connects to Amazon or to my server. I don't see any information in the namenode/datanode logs.
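
Would bumping the client-side log level be the right way to see this? Something like the following is what I had in mind (just a guess on my part that this is the intended mechanism):

export HADOOP_ROOT_LOGGER=DEBUG,console
hadoop fs -ls s3://bucket1/

I am hoping the DEBUG output would show which S3 endpoint the client tries to connect to.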

cheers,
Dhiraj

On Wed, Sep 10, 2014 at 3:13 PM, Harsh J <harsh@cloudera.com> wrote:
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
> Starting namenodes on []

NameNode/DataNode are part of a HDFS service. It makes no sense to try
and run them over an S3 URL default, which is a distributed filesystem
in itself. The services need fs.defaultFS to be set to a HDFS URI to
be able to start up.
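
For example, a minimal core-site.xml entry that lets the daemons come up would look roughly like this (a sketch only; the host and port below are placeholders for your own NameNode address):

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
</property>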

> but unable to get an s3 config started via hadoop

You can run jobs over S3 input and output data by running a regular MR
cluster on HDFS - just pass the right URI as input and output
parameters of the job. Set your S3 properties in core-site.xml but let
the fs.defaultFS be of HDFS type, to do this.
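
As a rough illustration (an untested sketch; the bucket name, keys and jar path below are placeholders), you would keep the S3 credentials in core-site.xml next to the HDFS fs.defaultFS:

<property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY</value>
</property>
<property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_KEY</value>
</property>

and then point the job directly at the bucket:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar \
    wordcount s3://bucket1/input s3://bucket1/output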

> There is an s3.impl until 1.2.1 release. So does the 2.5.0 release support s3 or do i need to do anything else.

In Apache Hadoop 2 we dynamically load the FS classes, so we do not need the fs.NAME.impl configs anymore as we did in Apache Hadoop 1.
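
In other words, a Hadoop 1 style mapping such as the one below (shown only for comparison, not something you need to add) is no longer required; the implementation for a scheme like s3:// is discovered from the classpath:

<property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3.S3FileSystem</value>
</property>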

On Wed, Sep 10, 2014 at 1:15 PM, Dhiraj <jarihd@gmail.com> wrote:
> Hi,
>
> I have downloaded hadoop-2.5.0 and am trying to get it working for s3
> backend (single-node in a pseudo-distributed mode).
> I have made changes to the core-site.xml according to
> https://wiki.apache.org/hadoop/AmazonS3
>
> I have a backend object store running on my machine that supports S3.
>
> I get the following message when i try to start the daemons
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or
> dfs.namenode.rpc-address is not configured.
>
>
> root@ubuntu:/build/hadoop/hadoop-2.5.0# ./sbin/start-dfs.sh
> Incorrect configuration: namenode address dfs.namenode.servicerpc-address or
> dfs.namenode.rpc-address is not configured.
> Starting namenodes on []
> localhost: starting namenode, logging to
> /build/hadoop/hadoop-2.5.0/logs/hadoop-root-namenode-ubuntu.out
> localhost: starting datanode, logging to
> /build/hadoop/hadoop-2.5.0/logs/hadoop-root-datanode-ubuntu.out
> Starting secondary namenodes [0.0.0.0]
> 0.0.0.0: starting secondarynamenode, logging to
> /build/hadoop/hadoop-2.5.0/logs/hadoop-root-secondarynamenode-ubuntu.out
> root@ubuntu:/build/hadoop/hadoop-2.5.0#
>
> The daemons don't start after the above.
> I get the same error if I add the property "fs.defaultFS" and set its value
> to the s3 bucket but if I change the defaultFS to hdfs:// it works fine - am
> able to launch the daemons.
>
> my core-site.xml:
> <configuration>
>     <property>
>         <name>fs.defaultFS</name>
>         <value>s3://bucket1</value>
>     </property>
>     <property>
>         <name>fs.s3.awsAccessKeyId</name>
>         <value>abcd</value>
>     </property>
>     <property>
>         <name>fs.s3.awsSecretAccessKey</name>
>         <value>1234</value>
>     </property>
> </configuration>
>
>
> I am able to list the buckets and its contents via s3cmd and boto; but
> unable to get an s3 config started via hadoop
>
> Also from the following core-default.xml listed on the website; I don't see an
> implementation for s3
> http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml
>
> There is an s3.impl until 1.2.1 release. So does the 2.5.0 release support
> s3 or do i need to do anything else.
>
> cheers,
> Dhiraj
>
>
>

--
Harsh J
