hadoop-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: hftp in Hadoop 0.20.2
Date Sun, 12 Aug 2012 02:06:08 GMT
Jian,

Do not rely on dfs.info.port; it is a deprecated property and no longer
exists in 2.x releases. Rely instead on the fuller dfs.http.address in
1.x and dfs.namenode.http.address in 2.x.
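
To spot the kind of mismatch discussed in this thread, a small script can
compare the port in dfs.http.address with dfs.info.port. This is only a
rough sketch: the helper names load_props and http_port_mismatch are made
up for illustration, and the sample XML merges into one document the two
properties that were reported in hdfs-site.xml and core-site.xml.

```python
import xml.etree.ElementTree as ET


def load_props(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def http_port_mismatch(props):
    """Return (address_port, info_port) if both are set and differ, else None."""
    address = props.get("dfs.http.address")
    info_port = props.get("dfs.info.port")
    if address is None or info_port is None:
        return None
    # Take whatever follows the last ':' in host:port as the port.
    address_port = address.rsplit(":", 1)[-1]
    if address_port != info_port:
        return (address_port, info_port)
    return None


# Hypothetical merged sample using the values quoted in this thread.
SAMPLE = """<configuration>
  <property>
    <name>dfs.http.address</name>
    <value>pnjhadoopnn01:50070</value>
  </property>
  <property>
    <name>dfs.info.port</name>
    <value>8023</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(http_port_mismatch(load_props(SAMPLE)))
```

A non-None result means the two settings disagree, and one of them should
be changed (or dfs.info.port removed) so that only one port is configured.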

On Sat, Aug 11, 2012 at 3:45 AM, Jian Fang
<jian.fang.subscribe@gmail.com> wrote:
> Thanks Joey for the clarification. I will ask our hadoop admin to change
> that.
> But it would be great if this could be mentioned in the distcp document.
>
> Thanks,
>
> Jian
>
>
> On Fri, Aug 10, 2012 at 6:06 PM, Joey Echeverria <joey@cloudera.com> wrote:
>>
>> Yes, the dfs.info.port controls the HTTP port of the NN, including for
>> HFTP.
>>
>> You should make sure that your settings for dfs.http.address and
>> dfs.info.port are in sync. So change one of those to match the port
>> number of the other.
>>
>> -Joey
>>
>> On Fri, Aug 10, 2012 at 5:41 PM, Jian Fang
>> <jian.fang.subscribe@gmail.com> wrote:
>> > Hi Joey,
>> >
>> > I ran the following command and got the jetty port as 8023.
>> >
>> >  $ grep "Jetty bound to port" hadoop-hadoop-namenode-pnjhadoopnn01.barnesandnoble.com.log*
>> >
>> > hadoop-hadoop-namenode-pnjhadoopnn01.barnesandnoble.com.log.2012-04-07:2012-04-07 20:56:16,334 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 8023
>> >
>> > Does this mean hftp is actually bound to port 8023?
>> >
>> > I am a bit confused. In hdfs-site.xml, we have the property defined as
>> > follows.
>> >
>> >
>> > <property>
>> >     <name>dfs.http.address</name>
>> >    <value>pnjhadoopnn01:50070</value>
>> > </property>
>> >
>> > and in core-site.xml, we have the following settings.
>> >
>> >   <property>
>> >     <name>fs.default.name</name>
>> >     <value>pnjhadoopnn01:8020</value>
>> >     <final>true</final>
>> >   </property>
>> >
>> >   <property>
>> >     <name>dfs.secondary.info.port</name>
>> >     <value>8022</value>
>> >   </property>
>> >   <property>
>> >     <name>dfs.info.port</name>
>> >     <value>8023</value>
>> >   </property>
>> >   <property>
>> >     <name>mapred.job.tracker.info.port</name>
>> >     <value>8024</value>
>> >   </property>
>> >   <property>
>> >     <name>tasktracker.http.port</name>
>> >     <value>8025</value>
>> >   </property>
>> >   <property>
>> >     <name>mapred.job.tracker.info.port</name>
>> >     <value>8024</value>
>> >   </property>
>> >
>> > Does this mean hadoop honors dfs.info.port over dfs.http.address?
>> >
>> > Thanks,
>> >
>> > Jian
>> >
>> > On Fri, Aug 10, 2012 at 5:08 PM, Joey Echeverria <joey@cloudera.com>
>> > wrote:
>> >>
>> >> Can you post your NN logs? It looks like the NN is not actually
>> >> started or is listening on another port for HTTP.
>> >>
>> >> -Joey
>> >>
>> >> On Fri, Aug 10, 2012 at 2:38 PM, Jian Fang
>> >> <jian.fang.subscribe@gmail.com> wrote:
>> >> > Already did that. Connection was rejected.
>> >> >
>> >> >
>> >> > On Fri, Aug 10, 2012 at 2:24 PM, Joey Echeverria <joey@cloudera.com>
>> >> > wrote:
>> >> >>
>> >> >> Try:
>> >> >>
>> >> >> $ telnet pnjhadoopnn01 50070
>> >> >>
>> >> >> -Joey
>> >> >>
>> >> >> On Fri, Aug 10, 2012 at 1:10 PM, Jian Fang
>> >> >> <jian.fang.subscribe@gmail.com> wrote:
>> >> >> > Here is the property in hdfs-site.xml
>> >> >> >
>> >> >> >    <property>
>> >> >> >       <name>dfs.http.address</name>
>> >> >> >       <value>pnjhadoopnn01:50070</value>
>> >> >> >    </property>
>> >> >> >
>> >> >> > Thanks,
>> >> >> >
>> >> >> > Jian
>> >> >> >
>> >> >> >
>> >> >> > On Fri, Aug 10, 2012 at 11:46 AM, Harsh J <harsh@cloudera.com>
>> >> >> > wrote:
>> >> >> >>
>> >> >> >> Yes the test was to figure out if there really was a listener on
>> >> >> >> 50070. Can you check the hdfs-site.xml on the NN machine for what
>> >> >> >> its dfs.http.address may really be using for its port?
>> >> >> >>
>> >> >> >> On Fri, Aug 10, 2012 at 7:48 PM, Jian Fang
>> >> >> >> <jian.fang.subscribe@gmail.com> wrote:
>> >> >> >> > Hi Harsh,
>> >> >> >> >
>> >> >> >> > It seems -p requires root privileges, which I don't have. I
>> >> >> >> > ran "netstat -a | grep 50070", but did not get anything back. As I
>> >> >> >> > said, telnet did not work either.
>> >> >> >> >
>> >> >> >> > [hadoop@pnjhadoopnn01 ~]$ telnet  pnjhadoopnn01 50070
>> >> >> >> > Trying xx.xx.xx.xx...
>> >> >> >> > telnet: connect to address xx.xx.xx.xx: Connection refused
>> >> >> >> > telnet: Unable to connect to remote host: Connection refused
>> >> >> >> >
>> >> >> >> > [hadoop@pnjhadoopnn01 ~]$ telnet localhost 50070
>> >> >> >> > Trying 127.0.0.1...
>> >> >> >> > telnet: connect to address 127.0.0.1: Connection refused
>> >> >> >> > telnet: Unable to connect to remote host: Connection refused
>> >> >> >> >
>> >> >> >> > Thanks,
>> >> >> >> >
>> >> >> >> > Jian
>> >> >> >> >
>> >> >> >> > On Fri, Aug 10, 2012 at 1:50 AM, Harsh J <harsh@cloudera.com>
>> >> >> >> > wrote:
>> >> >> >> >>
>> >> >> >> >> Jian,
>> >> >> >> >>
>> >> >> >> >> From your NN, can you get us the output "netstat -anp | grep 50070"?
>> >> >> >> >>
>> >> >> >> >> On Fri, Aug 10, 2012 at 9:29 AM, Jian Fang
>> >> >> >> >> <jian.fang.subscribe@gmail.com> wrote:
>> >> >> >> >> > Thanks Harsh. But there is no firewall there; the two clusters
>> >> >> >> >> > are on the same network. I cannot telnet to the port even on the
>> >> >> >> >> > same machine.
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >> > On Thu, Aug 9, 2012 at 6:00 PM, Harsh J <harsh@cloudera.com>
>> >> >> >> >> > wrote:
>> >> >> >> >> >>
>> >> >> >> >> >> Hi Jian,
>> >> >> >> >> >>
>> >> >> >> >> >> HFTP is always-on by default. Can you check and make sure that
>> >> >> >> >> >> the firewall isn't the cause of the connection refused on port
>> >> >> >> >> >> 50070 on the NN and ports 50075 on the DNs here?
>> >> >> >> >> >>
>> >> >> >> >> >> On Fri, Aug 10, 2012 at 1:47 AM, Jian Fang
>> >> >> >> >> >> <jian.fang.subscribe@gmail.com> wrote:
>> >> >> >> >> >> > Hi,
>> >> >> >> >> >> >
>> >> >> >> >> >> > We have a hadoop cluster of version 0.20.2 in production. Now we
>> >> >> >> >> >> > have another new Hadoop cluster using cloudera's CDH3U4. We would
>> >> >> >> >> >> > like to run distcp to copy files between the two clusters. Since
>> >> >> >> >> >> > the hadoop versions are different, we have to use the hftp protocol
>> >> >> >> >> >> > to copy files, based on the hadoop document here:
>> >> >> >> >> >> >
>> >> >> >> >> >> >
>> >> >> >> >> >> > http://hadoop.apache.org/common/docs/r0.20.2/distcp.html#cpver.
>> >> >> >> >> >> >
>> >> >> >> >> >> > The problem is that I cannot access files via hftp from the
>> >> >> >> >> >> > current production 0.20.2 cluster even though I can see the
>> >> >> >> >> >> > following setting from job tracker UI.
>> >> >> >> >> >> >
>> >> >> >> >> >> > dfs.http.address pnjhadoopnn01:50070
>> >> >> >> >> >> >
>> >> >> >> >> >> > I tried to telnet to this port, but got a "connection refused"
>> >> >> >> >> >> > error. It seems the hftp service is not actually running. Could
>> >> >> >> >> >> > someone tell me how to enable the hftp service in the 0.20.2
>> >> >> >> >> >> > hadoop cluster so that I can run distcp?
>> >> >> >> >> >> >
>> >> >> >> >> >> > Thanks in advance,
>> >> >> >> >> >> >
>> >> >> >> >> >> > John
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >>
>> >> >> >> >> >> --
>> >> >> >> >> >> Harsh J
>> >> >> >> >> >
>> >> >> >> >> >
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >>
>> >> >> >> >> --
>> >> >> >> >> Harsh J
>> >> >> >> >
>> >> >> >> >
>> >> >> >>
>> >> >> >>
>> >> >> >>
>> >> >> >> --
>> >> >> >> Harsh J
>> >> >> >
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Joey Echeverria
>> >> >> Principal Solutions Architect
>> >> >> Cloudera, Inc.
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Joey Echeverria
>> >> Principal Solutions Architect
>> >> Cloudera, Inc.
>> >
>> >
>>
>>
>>
>> --
>> Joey Echeverria
>> Principal Solutions Architect
>> Cloudera, Inc.
>
>



-- 
Harsh J
