ambari-dev mailing list archives

From "Myroslav Papirkovskyy" <mpapyrkovs...@hortonworks.com>
Subject Re: Review Request 42350: YARN service check was failed on cluster with enabled NN and Rm HAs
Date Fri, 15 Jan 2016 13:36:17 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42350/#review114705
-----------------------------------------------------------

Ship it!

- Myroslav Papirkovskyy


On Jan. 15, 2016, 3:31 p.m., Andrew Onischuk wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42350/
> -----------------------------------------------------------
> 
> (Updated Jan. 15, 2016, 3:31 p.m.)
> 
> 
> Review request for Ambari, Myroslav Papirkovskyy and Vitalyi Brodetskyi.
> 
> 
> Bugs: AMBARI-14686
>     https://issues.apache.org/jira/browse/AMBARI-14686
> 
> 
> Repository: ambari
> 
> 
> Description
> -------
> 
> Steps:
> 
>   1. Deployed cluster via UI.
>   2. Enabled NameNode HA.
>   3. Enabled ResourceManager HA.
>   4. Changed the ports to a custom value (38088) for the "yarn.resourcemanager.webapp.address.rm1" and "yarn.resourcemanager.webapp.address.rm2" properties.
> 
> Result: the YARN service check started failing:
> 
>     
>     
>     
>     stderr: 
>     Traceback (most recent call last):
>       File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 142, in <module>
>         ServiceCheck().execute()
>       File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
>         method(env)
>       File "/var/lib/ambari-agent/cache/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py", line 124, in service_check
>         path='/usr/sbin:/sbin:/usr/local/bin:/bin:/usr/bin',
>       File "/usr/lib/python2.6/site-packages/resource_management/libraries/functions/get_user_call_output.py", line 61, in get_user_call_output
>         raise Fail(err_msg)
>     resource_management.core.exceptions.Fail: Execution of 'curl --negotiate -u : -ksL --connect-timeout 5 http://qe-6521-1-4.novalocal:8088/ws/v1/cluster/apps/application_1452769486546_0003 1>/tmp/tmpC9KCX6 2>/tmp/tmp7j7b8L' returned 7.
>      stdout:
>     2016-01-14 11:22:34,990 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
>     2016-01-14 11:22:35,030 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
>     2016-01-14 11:22:35,036 - checked_call['yarn org.apache.hadoop.yarn.applications.distributedshell.Client -shell_command ls -num_containers 1 -jar /usr/hdp/current/hadoop-yarn-client/hadoop-yarn-applications-distributedshell.jar'] {'path': '/usr/sbin:/sbin:/usr/local/bin:/bin:/usr/bin', 'user': 'ambari-qa'}
>     2016-01-14 11:22:50,644 - checked_call returned (0, '######## Hortonworks #############\nThis
is MOTD message, added for testing in qe infra\n16/01/14 11:22:37 INFO impl.TimelineClientImpl:
Timeline service address: http://qe-6521-1-4.novalocal:8188/ws/v1/timeline/\n16/01/14 11:22:38
INFO distributedshell.Client: Initializing Client\n16/01/14 11:22:38 INFO distributedshell.Client:
Running Client\n16/01/14 11:22:39 INFO distributedshell.Client: Got Cluster metric info from
ASM, numNodeManagers=3\n16/01/14 11:22:39 INFO distributedshell.Client: Got Cluster node info
from ASM\n16/01/14 11:22:39 INFO distributedshell.Client: Got node report from ASM for, nodeId=qe-6521-1-4.novalocal:25454,
nodeAddressqe-6521-1-4.novalocal:8042, nodeRackName/default-rack, nodeNumContainers0\n16/01/14
11:22:39 INFO distributedshell.Client: Got node report from ASM for, nodeId=qe-6521-1-2.novalocal:25454,
nodeAddressqe-6521-1-2.novalocal:8042, nodeRackName/default-rack, nodeNumContainers0\n16/01/14
11:22:39 INFO distributedshell.Client: Got node report from ASM for, nodeId=qe-6521-1-5.novalocal:25454,
nodeAddressqe-6521-1-5.novalocal:8042, nodeRackName/default-rack, nodeNumContainers0\n16/01/14
11:22:39 INFO distributedshell.Client: Queue info, queueName=default, queueCurrentCapacity=0.0,
queueMaxCapacity=1.0, queueApplicationCount=0, queueChildQueueCount=0\n16/01/14 11:22:39 INFO
distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=SUBMIT_APPLICATIONS\n16/01/14
11:22:39 INFO distributedshell.Client: User ACL Info for Queue, queueName=root, userAcl=ADMINISTER_QUEUE\n16/01/14
11:22:39 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=SUBMIT_APPLICATIONS\n16/01/14
11:22:39 INFO distributedshell.Client: User ACL Info for Queue, queueName=default, userAcl=ADMINISTER_QUEUE\n16/01/14
11:22:39 INFO distributedshell.Client: Max mem capabililty of resources in this cluster 10240\n16/01/14
11:22:39 INFO distributedshell.Client: Max virtual cores capabililty of resources in this cluster 1\n16/01/14 11:22:39 INFO distributedshell.Client:
Copy App Master jar from local filesystem and add to local environment\n16/01/14 11:22:39
INFO distributedshell.Client: Set the environment for the application master\n16/01/14 11:22:39
INFO distributedshell.Client: Setting up app master command\n16/01/14 11:22:39 INFO distributedshell.Client:
Completed setting up app master command {{JAVA_HOME}}/bin/java -Xmx10m org.apache.hadoop.yarn.applications.distributedshell.ApplicationMaster
--container_memory 10 --container_vcores 1 --num_containers 1 --priority 0 1><LOG_DIR>/AppMaster.stdout
2><LOG_DIR>/AppMaster.stderr \n16/01/14 11:22:39 INFO distributedshell.Client: Submitting
application to ASM\n16/01/14 11:22:40 INFO impl.YarnClientImpl: Submitted application application_1452769486546_0003\n16/01/14
11:22:41 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToAMToken=null,
appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1452770560000, yarnAppState=ACCEPTED,
distributedFinalState=UNDEFINED, appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/,
appUser=ambari-qa\n16/01/14 11:22:42 INFO distributedshell.Client: Got application report
from ASM for, appId=3, clientToAMToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default,
appMasterRpcPort=-1, appStartTime=1452770560000, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED,
appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/, appUser=ambari-qa\n16/01/14
11:22:43 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToAMToken=null,
appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1, appStartTime=1452770560000,
yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/,
appUser=ambari-qa\n16/01/14 11:22:44 INFO distributedshell.Client: Got application report from ASM for,
appId=3, clientToAMToken=null, appDiagnostics=, appMasterHost=N/A, appQueue=default, appMasterRpcPort=-1,
appStartTime=1452770560000, yarnAppState=ACCEPTED, distributedFinalState=UNDEFINED, appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/,
appUser=ambari-qa\n16/01/14 11:22:45 INFO distributedshell.Client: Got application report
from ASM for, appId=3, clientToAMToken=null, appDiagnostics=, appMasterHost=qe-6521-1-2/172.22.65.158,
appQueue=default, appMasterRpcPort=-1, appStartTime=1452770560000, yarnAppState=RUNNING, distributedFinalState=UNDEFINED,
appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/, appUser=ambari-qa\n16/01/14
11:22:46 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToAMToken=null,
appDiagnostics=, appMasterHost=qe-6521-1-2/172.22.65.158, appQueue=default, appMasterRpcPort=-1, appStartTime=1452770560000, yarnAppState=RUNNING, distributedFinalState=UNDEFINED,
appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/, appUser=ambari-qa\n16/01/14
11:22:47 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToAMToken=null,
appDiagnostics=, appMasterHost=qe-6521-1-2/172.22.65.158, appQueue=default, appMasterRpcPort=-1,
appStartTime=1452770560000, yarnAppState=RUNNING, distributedFinalState=UNDEFINED, appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/,
appUser=ambari-qa\n16/01/14 11:22:48 INFO distributedshell.Client: Got application report
from ASM for, appId=3, clientToAMToken=null, appDiagnostics=, appMasterHost=qe-6521-1-2/172.22.65.158,
appQueue=default, appMasterRpcPort=-1, appStartTime=1452770560000, yarnAppState=RUNNING, distributedFinalState=UNDEFINED,
appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/, appUser=ambari-qa\n16/01/14 11:22:49 INFO distributedshell.Client: Got application report from ASM
for, appId=3, clientToAMToken=null, appDiagnostics=, appMasterHost=qe-6521-1-2/172.22.65.158,
appQueue=default, appMasterRpcPort=-1, appStartTime=1452770560000, yarnAppState=RUNNING, distributedFinalState=UNDEFINED,
appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/, appUser=ambari-qa\n16/01/14
11:22:50 INFO distributedshell.Client: Got application report from ASM for, appId=3, clientToAMToken=null,
appDiagnostics=, appMasterHost=qe-6521-1-2/172.22.65.158, appQueue=default, appMasterRpcPort=-1,
appStartTime=1452770560000, yarnAppState=FINISHED, distributedFinalState=SUCCEEDED, appTrackingUrl=http://qe-6521-1-2.novalocal:38088/proxy/application_1452769486546_0003/,
appUser=ambari-qa\n16/01/14 11:22:50 INFO distributedshell.Client: Application has completed
successfully. Breaking monitoring loop\n16/01/14 11:22:50 INFO distributedshell.Client: Application
completed successfully')
>     2016-01-14 11:22:50,646 - call['ambari-sudo.sh su ambari-qa -l -s /bin/bash -c 'curl --negotiate -u : -ksL --connect-timeout 5 http://qe-6521-1-4.novalocal:8088/ws/v1/cluster/apps/application_1452769486546_0003 1>/tmp/tmpC9KCX6 2>/tmp/tmp7j7b8L''] {'path': '/usr/sbin:/sbin:/usr/local/bin:/bin:/usr/bin', 'quiet': False}
>     2016-01-14 11:22:50,719 - call returned (7, '######## Hortonworks #############\nThis is MOTD message, added for testing in qe infra')
> 
> 
> Diffs
> -----
> 
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/params_linux.py 15ad4b4 
>   ambari-server/src/main/resources/common-services/YARN/2.1.0.2.0/package/scripts/service_check.py 6aca8b2 
> 
> Diff: https://reviews.apache.org/r/42350/diff/
> 
> 
> Testing
> -------
> 
> mvn clean test
> 
> 
> Thanks,
> 
> Andrew Onischuk
> 
>
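
Note on the failure above: the check ran curl against port 8088 on qe-6521-1-4.novalocal, while the application tracking URL in the same log reports the configured port 38088 on qe-6521-1-2.novalocal. curl exit code 7 means it could not connect to the host. The sketch below shows one way a check could derive the ResourceManager web addresses from yarn-site when HA is enabled. It is only an illustration under stated assumptions and is not the patch from this review (see the diff link above for that): the function names and the plain-dict `yarn_site` input are hypothetical, and the mapping of hosts to rm1/rm2 in the example is assumed. The property names themselves are standard YARN settings.

```python
def rm_webapp_addresses(yarn_site):
    """Return the ResourceManager web UI addresses (host:port) as a list.

    yarn_site is a plain dict of yarn-site properties. This is an assumption
    made for the sketch, not how the Ambari script receives its config.
    """
    ha_flag = str(yarn_site.get("yarn.resourcemanager.ha.enabled", "false"))
    if ha_flag.lower() != "true":
        # Non-HA: a single webapp address is configured.
        return [yarn_site["yarn.resourcemanager.webapp.address"]]

    # HA: one address per RM id, e.g. webapp.address.rm1, webapp.address.rm2.
    rm_ids = [rm_id.strip()
              for rm_id in yarn_site["yarn.resourcemanager.ha.rm-ids"].split(",")
              if rm_id.strip()]
    addresses = []
    for rm_id in rm_ids:
        key = "yarn.resourcemanager.webapp.address." + rm_id
        if key in yarn_site:
            addresses.append(yarn_site[key])
    return addresses


def app_status_urls(yarn_site, app_id, scheme="http"):
    """Build one REST URL per ResourceManager for the given application id."""
    return ["%s://%s/ws/v1/cluster/apps/%s" % (scheme, address, app_id)
            for address in rm_webapp_addresses(yarn_site)]


# Example values; which host is rm1 and which is rm2 is assumed here.
yarn_site = {
    "yarn.resourcemanager.ha.enabled": "true",
    "yarn.resourcemanager.ha.rm-ids": "rm1,rm2",
    "yarn.resourcemanager.webapp.address.rm1": "qe-6521-1-4.novalocal:38088",
    "yarn.resourcemanager.webapp.address.rm2": "qe-6521-1-2.novalocal:38088",
}
urls = app_status_urls(yarn_site, "application_1452769486546_0003")
```

With addresses taken from the per-RM properties, a check could query each URL in turn until one ResourceManager answers, instead of relying on a fixed port.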

