ambari-dev mailing list archives

From "Yusaku Sako" <yus...@hortonworks.com>
Subject Re: Review Request 27754: Hadoop install with yum timesout after 10 mins
Date Sat, 08 Nov 2014 00:20:09 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27754/#review60429
-----------------------------------------------------------

Ship it!

- Yusaku Sako


On Nov. 7, 2014, 11:42 p.m., Alejandro Fernandez wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27754/
> -----------------------------------------------------------
> 
> (Updated Nov. 7, 2014, 11:42 p.m.)
> 
> 
> Review request for Ambari, Mahadev Konar, Sumit Mohanty, Sid Wagle, and Yusaku Sako.
> 
> 
> Bugs: AMBARI-8220
>     https://issues.apache.org/jira/browse/AMBARI-8220
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Installs very often fail because installing the hadoop_2_2* packages times out; that step alone can take 8-12 mins.
> 
> Each service has a metainfo.xml file that defines the timeout for each Component for all types of actions (e.g., INSTALL, START, CONFIGURE, STOP).
> 
> Ambari doesn't currently have a mechanism to set a different timeout just for the INSTALL operation, so instead, the server-side Java code can do the following:
> 
> Get the default agent timeout from the ambari.properties file (which will be increased from 10 mins to 15 mins).
> 
> Get the service component's timeout if it exists. If the operation is an INSTALL and the service component's timeout is less than the default timeout, then use the default timeout.
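
The rule described above can be sketched as follows. This is a minimal illustration in Python, not the Java code under review; the function name `effective_timeout` and its parameters are hypothetical, and falling back to the agent default when a component defines no timeout is an assumption.

```python
def effective_timeout(action, component_timeout, agent_default):
    """Return the command timeout in seconds for one component action.

    action            -- action name such as "INSTALL" or "START"
    component_timeout -- timeout from metainfo.xml, or None if not defined
    agent_default     -- agent.task.timeout from ambari.properties
    """
    # Assumption: with no component timeout, the agent default applies.
    if component_timeout is None:
        return agent_default
    # For INSTALL only, never go below the agent default.
    if action == "INSTALL" and component_timeout < agent_default:
        return agent_default
    # Otherwise the component's own timeout is used unchanged.
    return component_timeout


if __name__ == "__main__":
    print(effective_timeout("INSTALL", 642, 900))
    print(effective_timeout("INSTALL", 1042, 900))
```

With an agent default of 900, a component timeout of 642 is raised to 900 for INSTALL, while 1042 is left alone.
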
> 
> 
> Diffs
> -----
> 
>   ambari-agent/src/main/python/ambari_agent/PythonExecutor.py 874b70b 
>   ambari-server/conf/unix/ambari.properties 8563cf2 
>   ambari-server/src/main/java/org/apache/ambari/server/configuration/Configuration.java a0d5b39 
>   ambari-server/src/main/java/org/apache/ambari/server/controller/AmbariManagementControllerImpl.java 4f69dbb 
> 
> Diff: https://reviews.apache.org/r/27754/diff/
> 
> 
> Testing
> -------
> 
> ----------------------------------------------------------------------
> Total run:693
> Total errors:0
> Total failures:0
> OK
> 
> 
> Created an HDP 2.2 cluster with just HDFS and ZK, and then changed the timeouts as follows,
> 
> 
> yes | cp /vagrant/ambari/ambari-agent/src/main/python/ambari_agent/PythonExecutor.py /usr/lib/python2.6/site-packages/ambari_server/PythonExecutor.py
> yes | cp /vagrant/ambari/ambari-agent/src/main/python/ambari_agent/PythonExecutor.py /usr/lib/python2.6/site-packages/ambari_agent/PythonExecutor.py
> yes | cp /vagrant/ambari/ambari-server/target/ambari-server-*.jar /usr/lib/ambari-server/ambari-server-*.jar
> 
> Edited /etc/ambari-server/conf/ambari.properties and changed the agent.task.timeout value from 600 to 900.
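
To double-check the edited value, a short Python sketch like the one below can read it back. It assumes simple `key=value` lines and does not handle the full Java properties syntax (`:` separators, line continuations); the function name and the fallback of 900 are hypothetical.

```python
def read_agent_task_timeout(properties_text, default=900):
    """Return agent.task.timeout in seconds from properties file text."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#") or line.startswith("!"):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "agent.task.timeout":
            return int(value.strip())
    # Assumption: fall back to a caller-supplied default if the key is absent.
    return default


if __name__ == "__main__":
    with open("/etc/ambari-server/conf/ambari.properties") as handle:
        print(read_agent_task_timeout(handle.read()))
```
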
> 
> Then modified the ResourceManager and NodeManager timeouts in /var/lib/ambari-agent/cache/stacks/HDP/2.0.6/services/YARN/metainfo.xml as follows,
> ResourceManager         <timeout>642</timeout>
> NodeManager             <timeout>1042</timeout>
> 
> Then ran ambari-server restart
> 
> Upon adding the YARN services and inspecting the command-*.json files, they had,
> 
> 
>     "commandParams": {
>         "command_timeout": "1042",
>         "script": "scripts/nodemanager.py",
>         "script_type": "PYTHON",
>         "service_package_folder": "HDP/2.0.6/services/YARN/package",
>         "hooks_folder": "HDP/2.0.6/hooks"
>     },
> 
>     "commandParams": {
>         "command_timeout": "900",
>         "script": "scripts/resourcemanager.py",
>         "script_type": "PYTHON",
>         "service_package_folder": "HDP/2.0.6/services/YARN/package",
>         "hooks_folder": "HDP/2.0.6/hooks"
>     },
> 
> 
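> Notice that the ResourceManager timeout was initially 642, which is less than the agent default of 900, so it was raised to 900. The NodeManager timeout of 1042 was already above the default and was left unchanged.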
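
The same check can be scripted. The sketch below pulls `command_timeout` out of a command JSON document; it assumes the file is a JSON object with a top-level `commandParams` key as in the fragments quoted above, and the function name is hypothetical.

```python
import json


def command_timeout(command_json_text):
    """Return commandParams.command_timeout as an integer number of seconds."""
    command = json.loads(command_json_text)
    # The value is stored as a string in the command file, e.g. "1042".
    return int(command["commandParams"]["command_timeout"])


if __name__ == "__main__":
    sample = """
    {
        "commandParams": {
            "command_timeout": "900",
            "script": "scripts/resourcemanager.py",
            "script_type": "PYTHON"
        }
    }
    """
    print(command_timeout(sample))
```
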
> 
> 
> Thanks,
> 
> Alejandro Fernandez
> 
>

