ambari-dev mailing list archives

From "Yusaku Sako (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (AMBARI-8244) Ambari HDP 2.0.6+ stacks do not work with fs.defaultFS not being hdfs
Date Sat, 28 Mar 2015 02:00:56 GMT

    [ https://issues.apache.org/jira/browse/AMBARI-8244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385052#comment-14385052 ]

Yusaku Sako edited comment on AMBARI-8244 at 3/28/15 2:00 AM:
--------------------------------------------------------------

The above Hadoop QA failure is unrelated to this patch.

When I tested AMBARI-8244.6.patch end-to-end on a real cluster, I found the following:
* Cluster installation and service checks are fine.
* Post-install, I was able to move the NameNode from c6401 to c6402, and things were OK after the move.
* When I enabled NameNode HA, things seemed to go smoothly.  dfs.namenode.rpc-address was still pointing to c6402, but *hadoop fs* commands to read/write on HDFS kept running fine after the NameNode on c6402 was shut down.
* The HDFS service check failed because it ran "hadoop dfsadmin -safemode get" with a "-fs" parameter that targets a specific NameNode based on dfs.namenode.rpc-address (this is due to changes in the patch); if that NameNode is down, the command obviously fails.  So we need to remove this property when NameNode HA is enabled (a sketch of this logic follows the comment).
* However, that might not be the full story.  *hadoop dfsadmin -safemode get* caused issues even when triggered from the command line; it still tried to access the NameNode that is down rather than the other one, even without the "-fs" parameter.  I hand-edited hdfs-site.xml to remove dfs.namenode.rpc-address, but "hadoop dfsadmin -safemode get" still tries to connect to the NameNode that is down (the HA client settings this failover depends on are sketched further below).
{code}
safemode: Call From c6401.ambari.apache.org/192.168.64.101 to c6402.ambari.apache.org:8020
failed on connection exception: java.net.ConnectException: Connection refused; For more details
see:  http://wiki.apache.org/hadoop/ConnectionRefused
{code}
What's confusing is that the "
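
As a rough illustration of the service-check behavior discussed above (a hypothetical sketch, not the actual AMBARI-8244 patch; the function name and the plain config dict are stand-ins for Ambari's params), the check could skip the "-fs" override entirely once a nameservice is defined and only pin to dfs.namenode.rpc-address in the non-HA case:
{code}
# Hypothetical sketch (not the attached patch): build the safemode
# service-check command from hdfs-site.xml values passed in as a dict.
def safemode_check_command(hdfs_site):
    # With NameNode HA a nameservice is defined, and the DFS client should
    # resolve the active NameNode itself, so no "-fs" override is passed.
    if hdfs_site.get('dfs.nameservices'):
        return "hadoop dfsadmin -safemode get"
    # Without HA, pin the check to the explicit NameNode RPC address, since
    # fs.defaultFS may point at a non-HDFS file system.
    rpc_address = hdfs_site['dfs.namenode.rpc-address']
    return "hadoop dfsadmin -fs hdfs://%s -safemode get" % rpc_address
{code}
For example, safemode_check_command({'dfs.namenode.rpc-address': 'c6402.ambari.apache.org:8020'}) yields the "-fs" form that fails whenever the NameNode on c6402 is down.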


was (Author: u39kun):
The above Hadoop QA failure is unrelated to this patch.

When I tested AMBARI-8244.6.patch end-to-end on a real cluster, I found the following:
* Cluster installation and service checks are fine.
* Post-install, I was able to move the NameNode from c6401 to c6402, and things were OK after the move.
* When I enabled NameNode HA, things seemed to go smoothly.  dfs.namenode.rpc-address was still pointing to c6402, but it worked fine after the NameNode on c6402 was shut down; *hadoop fs* commands to read/write on HDFS were running fine.
* The HDFS service check failed because it ran "hadoop dfsadmin -safemode get" with a "-fs" parameter that targets a specific NameNode based on dfs.namenode.rpc-address (this is due to changes in the patch); if that NameNode is down, the command obviously fails.  So we need to remove this property when NameNode HA is enabled.
* However, that might not be the full story.  *hadoop dfsadmin -safemode get* caused issues even when triggered from the command line; it still tried to access the NameNode that is down rather than the other one, even without the "-fs" parameter.  I hand-edited hdfs-site.xml to remove dfs.namenode.rpc-address, but "hadoop dfsadmin -safemode get" still tries to connect to the NameNode that is down.
{code}
safemode: Call From c6401.ambari.apache.org/192.168.64.101 to c6402.ambari.apache.org:8020
failed on connection exception: java.net.ConnectException: Connection refused; For more details
see:  http://wiki.apache.org/hadoop/ConnectionRefused
{code}
What's confusing is that the "
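
For context on why the CLI kept targeting the NameNode that is down: client-side failover in HDFS HA relies on the nameservice-scoped properties sketched below (the nameservice name "mycluster" is a placeholder; the hosts mirror the test cluster above). If fs.defaultFS, or a "-fs" override, still names a single host, the client has nothing to fail over to:
{code}
# Sketch of the client-side settings HDFS HA expects, shown as plain dicts.
# Without these, "hadoop dfsadmin" / "hadoop fs" keeps dialing whichever
# NameNode it was pointed at instead of failing over.
HA_CLIENT_CONFIG = {
    'core-site': {
        # The default FS must name the logical nameservice, not one host.
        'fs.defaultFS': 'hdfs://mycluster',
    },
    'hdfs-site': {
        'dfs.nameservices': 'mycluster',
        'dfs.ha.namenodes.mycluster': 'nn1,nn2',
        'dfs.namenode.rpc-address.mycluster.nn1': 'c6401.ambari.apache.org:8020',
        'dfs.namenode.rpc-address.mycluster.nn2': 'c6402.ambari.apache.org:8020',
        'dfs.client.failover.proxy.provider.mycluster':
            'org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider',
        # The un-suffixed dfs.namenode.rpc-address is the property the comment
        # above proposes removing once HA is enabled.
    },
}
{code}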

> Ambari HDP 2.0.6+ stacks do not work with fs.defaultFS not being hdfs
> ---------------------------------------------------------------------
>
>                 Key: AMBARI-8244
>                 URL: https://issues.apache.org/jira/browse/AMBARI-8244
>             Project: Ambari
>          Issue Type: Bug
>          Components: stacks
>    Affects Versions: 2.0.0
>            Reporter: Ivan Mitic
>            Assignee: Ivan Mitic
>              Labels: HDP
>             Fix For: 2.1.0
>
>         Attachments: AMBARI-8244.2.patch, AMBARI-8244.3.patch, AMBARI-8244.4.patch, AMBARI-8244.5.patch, AMBARI-8244.6.patch, AMBARI-8244.patch
>
>
> Right now, changing the default file system does not work with the HDP 2.0.6+ stacks. Given that it might be common to run HDP against some other file system in the cloud, adding support for this will be super useful. One alternative is to consider a separate stack definition for other file systems; however, given that I noticed just 2 minor bugs that needed fixing to support this, I would rather extend the existing code.
> Bugs:
>  - One issue is in the Nagios install scripts, where it is assumed that fs.defaultFS includes the namenode port number.
>  - Another issue is in the HDFS install scripts, where the {{hadoop dfsadmin}} command only works when hdfs is the default file system.
> The fix for both places is to extract the namenode address/port from {{dfs.namenode.rpc-address}} if it is defined and use it instead of relying on {{fs.defaultFS}} (sketched below).
> I haven't included any tests yet (this is my first Ambari patch and I'm not sure what is appropriate, so please comment).
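
A minimal sketch of the fix described in the issue (hypothetical helper name, not code from the attached patches): take host and port from {{dfs.namenode.rpc-address}} when it is defined, and only fall back to parsing {{fs.defaultFS}} when HDFS really is the default file system:
{code}
# Hypothetical helper illustrating the described fix: prefer
# dfs.namenode.rpc-address ("host:port"), falling back to fs.defaultFS
# (a URI such as hdfs://host:port) only when no explicit address is set.
def namenode_rpc_endpoint(hdfs_site, core_site, default_port=8020):
    rpc_address = hdfs_site.get('dfs.namenode.rpc-address')
    if rpc_address:
        host, _, port = rpc_address.partition(':')
        return host, int(port) if port else default_port
    # e.g. "hdfs://c6401.ambari.apache.org:8020" -> ("c6401.ambari.apache.org", 8020)
    authority = core_site['fs.defaultFS'].split('://', 1)[-1].split('/', 1)[0]
    host, _, port = authority.partition(':')
    return host, int(port) if port else default_port
{code}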



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
