Subject: Re: hftp in Hadoop 0.20.2
From: Jian Fang <jian.fang.subscribe@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 10 Aug 2012 18:15:21 -0400

Thanks, Joey, for the clarification. I will ask our Hadoop admin to change
that. It would be great if this could be mentioned in the distcp document.

Thanks,

Jian

On Fri, Aug 10, 2012 at 6:06 PM, Joey Echeverria <joey@cloudera.com> wrote:
> Yes, dfs.info.port controls the HTTP port of the NN, including for
> HFTP.
>
> You should make sure that your settings for dfs.http.address and
> dfs.info.port are in sync, so change one of them to match the port
> number of the other.
>
> -Joey
>
> On Fri, Aug 10, 2012 at 5:41 PM, Jian Fang
> <jian.fang.subscribe@gmail.com> wrote:
> > Hi Joey,
> >
> > I ran the following command and got the Jetty port as 8023.
> >
> > $ grep "Jetty bound to port" hadoop-hadoop-namenode-pnjhadoopnn01.barnesandnoble.com.log*
> > hadoop-hadoop-namenode-pnjhadoopnn01.barnesandnoble.com.log.2012-04-07:2012-04-07
> > 20:56:16,334 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 8023
> >
> > Does this mean hftp is actually bound to port 8023?
> >
> > I am a bit confused. In hdfs-site.xml, we have the property defined as
> > follows:
> >
> >   <property>
> >     <name>dfs.http.address</name>
> >     <value>pnjhadoopnn01:50070</value>
> >   </property>
> >
> > and in core-site.xml, we have the following settings:
> >
> >   <property>
> >     <name>fs.default.name</name>
> >     <value>pnjhadoopnn01:8020</value>
> >     <final>true</final>
> >   </property>
> >   <property>
> >     <name>dfs.secondary.info.port</name>
> >     <value>8022</value>
> >   </property>
> >   <property>
> >     <name>dfs.info.port</name>
> >     <value>8023</value>
> >   </property>
> >   <property>
> >     <name>mapred.job.tracker.info.port</name>
> >     <value>8024</value>
> >   </property>
> >   <property>
> >     <name>tasktracker.http.port</name>
> >     <value>8025</value>
> >   </property>
> >   <property>
> >     <name>mapred.job.tracker.info.port</name>
> >     <value>8024</value>
> >   </property>
> >
> > Does this mean Hadoop honors dfs.info.port over dfs.http.address?
> >
> > Thanks,
> >
> > Jian
> >
> > On Fri, Aug 10, 2012 at 5:08 PM, Joey Echeverria <joey@cloudera.com> wrote:
> >>
> >> Can you post your NN logs? It looks like the NN is not actually
> >> started, or is listening on another port for HTTP.
> >>
> >> -Joey
> >>
> >> On Fri, Aug 10, 2012 at 2:38 PM, Jian Fang
> >> <jian.fang.subscribe@gmail.com> wrote:
> >> > Already did that. The connection was refused.
> >> >
> >> > On Fri, Aug 10, 2012 at 2:24 PM, Joey Echeverria <joey@cloudera.com>
> >> > wrote:
> >> >>
> >> >> Try:
> >> >>
> >> >> $ telnet pnjhadoopnn01 50070
> >> >>
> >> >> -Joey
> >> >>
> >> >> On Fri, Aug 10, 2012 at 1:10 PM, Jian Fang
> >> >> <jian.fang.subscribe@gmail.com> wrote:
> >> >> > Here is the property in hdfs-site.xml:
> >> >> >
> >> >> >   <property>
> >> >> >     <name>dfs.http.address</name>
> >> >> >     <value>pnjhadoopnn01:50070</value>
> >> >> >   </property>
> >> >> >
> >> >> > Thanks,
> >> >> >
> >> >> > Jian
> >> >> >
> >> >> > On Fri, Aug 10, 2012 at 11:46 AM, Harsh J <harsh@cloudera.com> wrote:
> >> >> >>
> >> >> >> Yes, the test was to figure out whether there really was a listener
> >> >> >> on 50070. Can you check the hdfs-site.xml on the NN machine for what
> >> >> >> port its dfs.http.address may really be using?
> >> >> >>
> >> >> >> On Fri, Aug 10, 2012 at 7:48 PM, Jian Fang
> >> >> >> <jian.fang.subscribe@gmail.com> wrote:
> >> >> >> > Hi Harsh,
> >> >> >> >
> >> >> >> > It seems the -p option requires root privileges, which I don't
> >> >> >> > have. I ran "netstat -a | grep 50070" but did not get anything
> >> >> >> > back. As I said, telnet did not work either.
> >> >> >> >
> >> >> >> > [hadoop@pnjhadoopnn01 ~]$ telnet pnjhadoopnn01 50070
> >> >> >> > Trying xx.xx.xx.xx...
> >> >> >> > telnet: connect to address xx.xx.xx.xx: Connection refused
> >> >> >> > telnet: Unable to connect to remote host: Connection refused
> >> >> >> >
> >> >> >> > [hadoop@pnjhadoopnn01 ~]$ telnet localhost 50070
> >> >> >> > Trying 127.0.0.1...
> >> >> >> > telnet: connect to address 127.0.0.1: Connection refused
> >> >> >> > telnet: Unable to connect to remote host: Connection refused
> >> >> >> >
> >> >> >> > Thanks,
> >> >> >> >
> >> >> >> > Jian
> >> >> >> >
> >> >> >> > On Fri, Aug 10, 2012 at 1:50 AM, Harsh J <harsh@cloudera.com>
> >> >> >> > wrote:
> >> >> >> >>
> >> >> >> >> Jian,
> >> >> >> >>
> >> >> >> >> From your NN, can you get us the output of
> >> >> >> >> "netstat -anp | grep 50070"?
> >> >> >> >>
> >> >> >> >> On Fri, Aug 10, 2012 at 9:29 AM, Jian Fang
> >> >> >> >> <jian.fang.subscribe@gmail.com> wrote:
> >> >> >> >> > Thanks, Harsh. But there is no firewall there; the two
> >> >> >> >> > clusters are on the same network. I cannot telnet to the
> >> >> >> >> > port even on the same machine.
> >> >> >> >> >
> >> >> >> >> > On Thu, Aug 9, 2012 at 6:00 PM, Harsh J <harsh@cloudera.com>
> >> >> >> >> > wrote:
> >> >> >> >> >>
> >> >> >> >> >> Hi Jian,
> >> >> >> >> >>
> >> >> >> >> >> HFTP is always on by default. Can you check and make sure
> >> >> >> >> >> that the firewall isn't the cause of the connection refused
> >> >> >> >> >> on port 50070 on the NN and port 50075 on the DNs here?
> >> >> >> >> >>
> >> >> >> >> >> On Fri, Aug 10, 2012 at 1:47 AM, Jian Fang
> >> >> >> >> >> <jian.fang.subscribe@gmail.com> wrote:
> >> >> >> >> >> > Hi,
> >> >> >> >> >> >
> >> >> >> >> >> > We have a Hadoop cluster of version 0.20.2 in production.
> >> >> >> >> >> > Now we have another new Hadoop cluster using Cloudera's
> >> >> >> >> >> > CDH3u4. We would like to run distcp to copy files between
> >> >> >> >> >> > the two clusters. Since the Hadoop versions are different,
> >> >> >> >> >> > we have to use the hftp protocol to copy files, based on
> >> >> >> >> >> > the Hadoop document here:
> >> >> >> >> >> > http://hadoop.apache.org/common/docs/r0.20.2/distcp.html#cpver
> >> >> >> >> >> >
> >> >> >> >> >> > The problem is that I cannot access files via hftp from
> >> >> >> >> >> > the current production 0.20.2 cluster, even though I can
> >> >> >> >> >> > see the following setting in the job tracker UI:
> >> >> >> >> >> >
> >> >> >> >> >> > dfs.http.address pnjhadoopnn01:50070
> >> >> >> >> >> >
> >> >> >> >> >> > I tried to telnet to this port but got a "connection
> >> >> >> >> >> > refused" error. It seems the hftp service is not actually
> >> >> >> >> >> > running. Could someone tell me how to enable the hftp
> >> >> >> >> >> > service in the 0.20.2 Hadoop cluster so that I can run
> >> >> >> >> >> > distcp?
> >> >> >> >> >> >
> >> >> >> >> >> > Thanks in advance,
> >> >> >> >> >> >
> >> >> >> >> >> > John
> >> >> >> >> >>
> >> >> >> >> >> --
> >> >> >> >> >> Harsh J
> >> >> >> >>
> >> >> >> >> --
> >> >> >> >> Harsh J
> >> >> >>
> >> >> >> --
> >> >> >> Harsh J
> >> >>
> >> >> --
> >> >> Joey Echeverria
> >> >> Principal Solutions Architect
> >> >> Cloudera, Inc.
> >>
> >> --
> >> Joey Echeverria
> >> Principal Solutions Architect
> >> Cloudera, Inc.
>
> --
> Joey Echeverria
> Principal Solutions Architect
> Cloudera, Inc.
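
Given the dfs.info.port value of 8023 found in the NameNode log above, a quick
sanity check before retrying distcp is to confirm the listener and fetch a
listing through the hftp:// scheme. This is a minimal sketch, assuming shell
access to a node of the 0.20.2 cluster; the hostname and port are the ones
discussed in the thread:

  # Confirm that something is listening on the port Jetty reported
  netstat -an | grep 8023

  # List the HDFS root over HFTP, served by the NameNode's embedded HTTP server
  hadoop fs -ls hftp://pnjhadoopnn01:8023/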
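
If the HFTP listing works, the copy itself is run from the newer (CDH3u4)
destination cluster, reading the source over hftp and writing over hdfs, as
described in the distcp document linked in the thread. A sketch only: the
destination NameNode host and the paths are placeholders, and the source port
should match whatever dfs.info.port / dfs.http.address resolve to:

  hadoop distcp hftp://pnjhadoopnn01:8023/user/data \
                hdfs://<destination-namenode>:8020/user/data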