ambari-issues mailing list archives

From "Sumit Mohanty (JIRA)" <j...@apache.org>
Subject [jira] [Reopened] (AMBARI-19930) The service check status was set to TIMEOUT even if service check was failed
Date Sun, 12 Feb 2017 06:24:41 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-19930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sumit Mohanty reopened AMBARI-19930:
------------------------------------

Needs to be committed to branch-2.5

> The service check status was set to TIMEOUT even if service check was failed
> ----------------------------------------------------------------------------
>
>                 Key: AMBARI-19930
>                 URL: https://issues.apache.org/jira/browse/AMBARI-19930
>             Project: Ambari
>          Issue Type: Bug
>            Reporter: Yesha Vora
>            Assignee: Myroslav Papirkovskyi
>
> Steps to reproduce:
> * Install a cluster with Hadoop, Tez, HBase, Hive, Spark
> * Enable Wire encryption
> * Run Tez service check
> Here, agent.service.check.task.timeout is set to 600 sec. The Tez application was started in the background. The service check then looks for the SUCCESS file for only a couple of minutes. In this particular instance, the application took 5 minutes to run, so the check for the SUCCESS file on HDFS failed.
> In this scenario, the service check status should be failed instead of Timeout.
> {code}
> stderr:   /var/lib/ambari-agent/data/errors-370.txt
> stdout:   /var/lib/ambari-agent/data/output-370.txt
> 2017-02-08 03:55:55,017 - HdfsResource['/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz'] {'security_enabled':
True, 'hadoop_bin_dir': '/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab',
'source': '/usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz', 'dfs_type': '', 'default_fs': 'hdfs://host:8020',
'replace_existing_files': False, 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore',
'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs@EXAMPLE.COM',
'user': 'hdfs', 'owner': 'hdfs', 'group': 'hadoop', 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf',
'type': 'file', 'action': ['create_on_execute'], 'immutable_paths': [u'/apps/hive/warehouse',
u'/mr-history/done', u'/app-logs', u'/tmp'], 'mode': 0444}
> 2017-02-08 03:55:55,017 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/hdfs.headless.keytab
hdfs@EXAMPLE.COM'] {'user': 'hdfs'}
> 2017-02-08 03:55:55,096 - call['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'curl -sS -L
-w '"'"'%{http_code}'"'"' -X GET --negotiate -u : -k '"'"'https://host:50470/webhdfs/v1/hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz?op=GETFILESTATUS&user.name=hdfs'"'"'
1>/tmp/tmpoIadeN 2>/tmp/tmp6nFiLj''] {'logoutput': None, 'quiet': False}
> 2017-02-08 03:55:55,292 - call returned (0, '')
> 2017-02-08 03:55:55,293 - DFS file /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz is identical
to /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz, skipping the copying
> 2017-02-08 03:55:55,293 - Will attempt to copy tez tarball from /usr/hdp/2.6.0.0-xxx/tez/lib/tez.tar.gz
to DFS at /hdp/apps/2.6.0.0-xxx/tez/tez.tar.gz.
> 2017-02-08 03:55:55,293 - HdfsResource[None] {'security_enabled': True, 'hadoop_bin_dir':
'/usr/hdp/current/hadoop-client/bin', 'keytab': '/etc/security/keytabs/hdfs.headless.keytab',
'dfs_type': '', 'default_fs': 'hdfs://host:8020', 'hdfs_resource_ignore_file': '/var/lib/ambari-agent/data/.hdfs_resource_ignore',
'hdfs_site': ..., 'kinit_path_local': '/usr/bin/kinit', 'principal_name': 'hdfs@EXAMPLE.COM',
'user': 'hdfs', 'action': ['execute'], 'hadoop_conf_dir': '/usr/hdp/current/hadoop-client/conf',
'immutable_paths': [u'/apps/hive/warehouse', u'/mr-history/done', u'/app-logs', u'/tmp']}
> 2017-02-08 03:55:55,294 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/smokeuser.headless.keytab
ambari-qa-cl1@EXAMPLE.COM;'] {'user': 'ambari-qa'}
> 2017-02-08 03:55:55,389 - ExecuteHadoop['jar /usr/hdp/current/tez-client/tez-examples*.jar
orderedwordcount /tmp/tezsmokeinput/sample-tez-test /tmp/tezsmokeoutput/'] {'try_sleep': 5,
'tries': 3, 'bin_dir': '/usr/hdp/current/hadoop-client/bin', 'user': 'ambari-qa', 'conf_dir':
'/usr/hdp/current/hadoop-client/conf'}
> 2017-02-08 03:55:55,390 - Execute['hadoop --config /usr/hdp/current/hadoop-client/conf
jar /usr/hdp/current/tez-client/tez-examples*.jar orderedwordcount /tmp/tezsmokeinput/sample-tez-test
/tmp/tezsmokeoutput/'] {'logoutput': None, 'try_sleep': 5, 'environment': {}, 'tries': 3,
'user': 'ambari-qa', 'path': ['/usr/hdp/current/hadoop-client/bin']}{code}
> {code}
> Requests: {
> aborted_task_count: 0,
> cluster_name: "cl1",
> completed_task_count: 1,
> create_time: 1486526151743,
> end_time: 1486526463038,
> exclusive: false,
> failed_task_count: 0,
> id: 29,
> inputs: "{}",
> operation_level: null,
> progress_percent: 100,
> queued_task_count: 0,
> request_context: "WE API TEZ Service Check",
> request_schedule: null,
> request_status: "TIMEDOUT",
> resource_filters: [
> {
> service_name: "TEZ"
> }
> ],
> start_time: 1486526151751,
> task_count: 1,
> timed_out_task_count: 1,
> type: "COMMAND"
> },{code}
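
As a rough sanity check on the request above, the elapsed time can be computed from start_time and end_time and compared with the 600 sec value of agent.service.check.task.timeout. This is only a sketch: the timestamps and the timeout are copied from the report, while the helper expected_status and its check_passed argument are hypothetical names used for illustration and are not part of Ambari.

```python
# Values copied from the request JSON and the description in this report.
start_time_ms = 1486526151751
end_time_ms = 1486526463038
task_timeout_s = 600  # agent.service.check.task.timeout

# Wall-clock duration of the request, in seconds.
elapsed_s = (end_time_ms - start_time_ms) / 1000.0


def expected_status(elapsed_s, timeout_s, check_passed):
    """Hypothetical helper: the status the reporter argues for.

    A task is reported as timed out only when it actually ran past the
    configured timeout; otherwise the check result decides the status.
    """
    if elapsed_s > timeout_s:
        return "TIMEDOUT"
    return "COMPLETED" if check_passed else "FAILED"


print("elapsed: %.1f s" % elapsed_s)
print("exceeded task timeout: %s" % (elapsed_s > task_timeout_s))
# The SUCCESS file was not found, so the check itself did not pass.
print("expected status: %s" % expected_status(elapsed_s, task_timeout_s, False))
```

With these numbers the request ran for about 311 seconds, roughly half of the configured timeout, which is consistent with the report that the status should be failed and not a timeout.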



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
