ambari-dev mailing list archives

From "Dmitry Lysnichenko (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (AMBARI-13295) ACCUMULO_TRACER START failed after enabling Kerberos
Date Fri, 02 Oct 2015 10:43:26 GMT

     [ https://issues.apache.org/jira/browse/AMBARI-13295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitry Lysnichenko updated AMBARI-13295:
----------------------------------------
    Affects Version/s:     (was: 2.1.3)
                       2.1.2

> ACCUMULO_TRACER START failed after enabling Kerberos
> ----------------------------------------------------
>
>                 Key: AMBARI-13295
>                 URL: https://issues.apache.org/jira/browse/AMBARI-13295
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.1.2
>            Reporter: Dmitry Lysnichenko
>            Assignee: Dmitry Lysnichenko
>             Fix For: 2.1.3
>
>         Attachments: AMBARI-13295.patch
>
>
> After enabling Kerberos, ACCUMULO_TRACER START failed on the "Start and Test Services" step.
> {code}
> "stderr" : "Python script has been killed due to timeout after waiting 180 secs",
> {code}
> {code}
> "stdout" : "2015-09-25 14:42:53,963 - Group['custom-spark'] {}\n2015-09-25 14:42:53,964
- Group['hadoop'] {}\n2015-09-25 14:42:53,965 - Group['custom-users'] {}\n2015-09-25 14:42:53,965
- Group['custom-knox-group'] {}\n2015-09-25 14:42:53,965 - User['custom-sqoop'] {'gid': 'hadoop',
'groups': [u'hadoop']}\n2015-09-25 14:42:53,966 - User['custom-knox'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,967 - User['custom-hdfs'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25
14:42:53,968 - User['custom-oozie'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25
14:42:53,969 - User['custom-smoke'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25
14:42:53,970 - User['custom-hbase'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,971
- User['custom-tez'] {'gid': 'hadoop', 'groups': [u'custom-users']}\n2015-09-25 14:42:53,972
- User['custom-hive'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,973 -
User['custom-mr'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,973 - User['custom-accumulo']
{'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,974 - User['custom-hcat'] {'gid':
'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,975 - User['custom-ams'] {'gid': 'hadoop',
'groups': [u'hadoop']}\n2015-09-25 14:42:53,976 - User['custom-yarn'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,977 - User['custom-falcon'] {'gid': 'hadoop', 'groups':
[u'custom-users']}\n2015-09-25 14:42:53,977 - User['custom-spark'] {'gid': 'hadoop', 'groups':
[u'hadoop']}\n2015-09-25 14:42:53,978 - User['custom-atlas'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25
14:42:53,979 - User['custom-flume'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,980
- User['custom-kafka'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,981 -
User['custom-zookeeper'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,982
- User['custom-mahout'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,982
- User['custom-storm'] {'gid': 'hadoop', 'groups': [u'hadoop']}\n2015-09-25 14:42:53,983 -
File['/var/lib/ambari-agent/tmp/changeUid.sh'] {'content': StaticFile('changeToSecureUid.sh'),
'mode': 0555}\n2015-09-25 14:42:53,985 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-smoke
/tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke']
{'not_if': '(test $(id -u custom-smoke) -gt 1000) || (false)'}\n2015-09-25 14:42:53,991 -
Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-smoke /tmp/hadoop-custom-smoke,/tmp/hsperfdata_custom-smoke,/home/custom-smoke,/tmp/custom-smoke,/tmp/sqoop-custom-smoke']
due to not_if\n2015-09-25 14:42:53,991 - Directory['/tmp/hbase-hbase'] {'owner': 'custom-hbase',
'recursive': True, 'mode': 0775, 'cd_access': 'a'}\n2015-09-25 14:42:53,992 - File['/var/lib/ambari-agent/tmp/changeUid.sh']
{'content': StaticFile('changeToSecureUid.sh'), 'mode': 0555}\n2015-09-25 14:42:53,993 - Execute['/var/lib/ambari-agent/tmp/changeUid.sh
custom-hbase /home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase']
{'not_if': '(test $(id -u custom-hbase) -gt 1000) || (false)'}\n2015-09-25 14:42:53,999 -
Skipping Execute['/var/lib/ambari-agent/tmp/changeUid.sh custom-hbase /home/custom-hbase,/tmp/custom-hbase,/usr/bin/custom-hbase,/var/log/custom-hbase,/tmp/hbase-hbase']
due to not_if\n2015-09-25 14:42:54,000 - Group['custom-hdfs'] {'ignore_failures': False}\n2015-09-25
14:42:54,000 - User['custom-hdfs'] {'ignore_failures': False, 'groups': [u'hadoop', u'custom-hdfs']}\n2015-09-25
14:42:54,001 - Directory['/etc/hadoop'] {'mode': 0755}\n2015-09-25 14:42:54,019 - File['/usr/hdp/current/hadoop-client/conf/hadoop-env.sh']
{'content': InlineTemplate(...), 'owner': 'root', 'group': 'hadoop'}\n2015-09-25 14:42:54,019
- Directory['/var/lib/ambari-agent/tmp/hadoop_java_io_tmpdir'] {'owner': 'custom-hdfs', 'group':
'hadoop', 'mode': 0777}\n2015-09-25 14:42:54,032 - Execute[('setenforce', '0')] {'not_if':
'(! which getenforce ) || (which getenforce && getenforce | grep -q Disabled)', 'sudo':
True, 'only_if': 'test -f /selinux/enforce'}\n2015-09-25 14:42:54,039 - Skipping Execute[('setenforce',
'0')] due to not_if\n2015-09-25 14:42:54,040 - Directory['/grid/0/log/hadoop'] {'owner': 'root',
'mode': 0775, 'group': 'hadoop', 'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,043
- Directory['/var/run/hadoop'] {'owner': 'root', 'group': 'root', 'recursive': True, 'cd_access':
'a'}\n2015-09-25 14:42:54,043 - Directory['/tmp/hadoop-custom-hdfs'] {'owner': 'custom-hdfs',
'recursive': True, 'cd_access': 'a'}\n2015-09-25 14:42:54,048 - File['/usr/hdp/current/hadoop-client/conf/commons-logging.properties']
{'content': Template('commons-logging.properties.j2'), 'owner': 'root'}\n2015-09-25 14:42:54,051
- File['/usr/hdp/current/hadoop-client/conf/health_check'] {'content': Template('health_check.j2'),
'owner': 'root'}\n2015-09-25 14:42:54,051 - File['/usr/hdp/current/hadoop-client/conf/log4j.properties']
{'content': ..., 'owner': 'custom-hdfs', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,074
- File['/usr/hdp/current/hadoop-client/conf/hadoop-metrics2.properties'] {'content': Template('hadoop-metrics2.properties.j2'),
'owner': 'custom-hdfs'}\n2015-09-25 14:42:54,075 - File['/usr/hdp/current/hadoop-client/conf/task-log4j.properties']
{'content': StaticFile('task-log4j.properties'), 'mode': 0755}\n2015-09-25 14:42:54,076 -
File['/usr/hdp/current/hadoop-client/conf/configuration.xsl'] {'owner': 'custom-hdfs', 'group':
'hadoop'}\n2015-09-25 14:42:54,083 - File['/etc/hadoop/conf/topology_mappings.data'] {'owner':
'custom-hdfs', 'content': Template('topology_mappings.data.j2'), 'only_if': 'test -d /etc/hadoop/conf',
'group': 'hadoop'}\n2015-09-25 14:42:54,089 - File['/etc/hadoop/conf/topology_script.py']
{'content': StaticFile('topology_script.py'), 'only_if': 'test -d /etc/hadoop/conf', 'mode':
0755}\n2015-09-25 14:42:54,275 - Directory['/usr/hdp/current/accumulo-tracer/conf'] {'owner':
'custom-accumulo', 'group': 'hadoop', 'recursive': True, 'mode': 0755}\n2015-09-25 14:42:54,277
- Directory['/usr/hdp/current/accumulo-tracer/conf/server'] {'owner': 'custom-accumulo', 'group':
'hadoop', 'recursive': True, 'mode': 0700}\n2015-09-25 14:42:54,278 - XmlConfig['accumulo-site.xml']
{'group': 'hadoop', 'conf_dir': '/usr/hdp/current/accumulo-tracer/conf/server', 'mode': 0600,
'configuration_attributes': {}, 'owner': 'custom-accumulo', 'configurations': ...}\n2015-09-25
14:42:54,292 - Generating config: /usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml\n2015-09-25
14:42:54,293 - File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-site.xml'] {'owner':
'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': 0600, 'encoding':
'UTF-8'}\n2015-09-25 14:42:54,317 - Directory['/var/run/accumulo'] {'owner': 'custom-accumulo',
'group': 'hadoop', 'recursive': True}\n2015-09-25 14:42:54,318 - Directory['/grid/0/log/accumulo']
{'owner': 'custom-accumulo', 'group': 'hadoop', 'recursive': True}\n2015-09-25 14:42:54,323
- File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-env.sh'] {'content': InlineTemplate(...),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 0644}\n2015-09-25 14:42:54,324 - PropertiesFile['/usr/hdp/current/accumulo-tracer/conf/server/client.conf']
{'owner': 'custom-accumulo', 'group': 'hadoop', 'properties': {'instance.zookeeper.host':
u'ambari-ooziehive-r1-2.novalocal:2181,ambari-ooziehive-r1-3.novalocal:2181,ambari-ooziehive-r1-5.novalocal:2181',
'instance.name': u'hdp-accumulo-instance', 'instance.rpc.sasl.enabled': True, 'instance.zookeeper.timeout':
u'30s'}}\n2015-09-25 14:42:54,329 - Generating properties file: /usr/hdp/current/accumulo-tracer/conf/server/client.conf\n2015-09-25
14:42:54,329 - File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] {'owner':
'custom-accumulo', 'content': InlineTemplate(...), 'group': 'hadoop', 'mode': None}\n2015-09-25
14:42:54,332 - Writing File['/usr/hdp/current/accumulo-tracer/conf/server/client.conf'] because
contents don't match\n2015-09-25 14:42:54,333 - File['/usr/hdp/current/accumulo-tracer/conf/server/log4j.properties']
{'content': ..., 'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': 0644}\n2015-09-25
14:42:54,333 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,337
- File['/usr/hdp/current/accumulo-tracer/conf/server/auditLog.xml'] {'content': Template('auditLog.xml.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,337 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,341
- File['/usr/hdp/current/accumulo-tracer/conf/server/generic_logger.xml'] {'content': Template('generic_logger.xml.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,342 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,344
- File['/usr/hdp/current/accumulo-tracer/conf/server/monitor_logger.xml'] {'content': Template('monitor_logger.xml.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,345 - File['/usr/hdp/current/accumulo-tracer/conf/server/accumulo-metrics.xml']
{'content': StaticFile('accumulo-metrics.xml'), 'owner': 'custom-accumulo', 'group': 'hadoop',
'mode': 0644}\n2015-09-25 14:42:54,346 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/tracers']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,348
- File['/usr/hdp/current/accumulo-tracer/conf/server/tracers'] {'content': Template('tracers.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,349 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/gc']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,351
- File['/usr/hdp/current/accumulo-tracer/conf/server/gc'] {'content': Template('gc.j2'), 'owner':
'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,352 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/monitor']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,354
- File['/usr/hdp/current/accumulo-tracer/conf/server/monitor'] {'content': Template('monitor.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,355 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/slaves']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,357
- File['/usr/hdp/current/accumulo-tracer/conf/server/slaves'] {'content': Template('slaves.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,357 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/masters']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,359
- File['/usr/hdp/current/accumulo-tracer/conf/server/masters'] {'content': Template('masters.j2'),
'owner': 'custom-accumulo', 'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,360 - TemplateConfig['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties']
{'owner': 'custom-accumulo', 'template_tag': None, 'group': 'hadoop'}\n2015-09-25 14:42:54,368
- File['/usr/hdp/current/accumulo-tracer/conf/server/hadoop-metrics2-accumulo.properties']
{'content': Template('hadoop-metrics2-accumulo.properties.j2'), 'owner': 'custom-accumulo',
'group': 'hadoop', 'mode': None}\n2015-09-25 14:42:54,369 - Execute['/usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab
custom-accumulo@EXAMPLE.COM; ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server
/usr/hdp/current/accumulo-client/bin/accumulo init --reset-security --user custom-accumulo@EXAMPLE.COM
--password NA >/grid/0/log/accumulo/accumulo-reset.out 2>/grid/0/log/accumulo/accumulo-reset.err']
{'not_if': 'ambari-sudo.sh su custom-accumulo -l -s /bin/bash -c \\'/usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab
custom-accumulo@EXAMPLE.COM; ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server
/usr/hdp/current/accumulo-client/bin/accumulo shell -e \"userpermissions -u custom-accumulo@EXAMPLE.COM\"
| grep System.CREATE_TABLE\\'', 'user': 'custom-accumulo'}",
> {code}
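> For reference, the resource the agent killed is the last Execute in the log above: the kinit plus "accumulo init --reset-security" call, guarded by a not_if check that runs "accumulo shell -e userpermissions" as the custom-accumulo user. A minimal sketch of re-running that guard by hand follows (the paths, principal, and user are copied from the log; plain su stands in for ambari-sudo.sh). If this check itself blocks on the SASL errors shown in the tserver log below, the start script never returns and the agent kills it after 180 seconds:
> {code}
> # Manual re-run of the not_if guard from the Execute above (sketch only;
> # keytab path, principal, and conf dir are taken from the log, not verified here).
> su - custom-accumulo -s /bin/bash -c '
>   /usr/bin/kinit -kt /etc/security/keytabs/accumulo.headless.keytab custom-accumulo@EXAMPLE.COM
>   ACCUMULO_CONF_DIR=/usr/hdp/current/accumulo-tracer/conf/server \
>     /usr/hdp/current/accumulo-client/bin/accumulo shell \
>       -e "userpermissions -u custom-accumulo@EXAMPLE.COM" | grep System.CREATE_TABLE
> '
> {code}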
> The tserver log contains the following exceptions:
> {code}
> 2015-09-25 14:29:38,821 [tserver.TabletServer] INFO : Started replication service on
ambari-ooziehive-r1-2.novalocal:10002
> 2015-09-25 14:29:55,489 [server.TThreadPoolServer] ERROR: Error occurred during processing
of message.
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
> 	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
> 	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
> 	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:360)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
> 	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TTransportException
> 	at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> 	at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> 	at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
> 	at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
> 	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> 	at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> 	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> 	... 11 more
> 2015-09-25 14:30:01,812 [tserver.TabletServer] INFO : Loading tablet !0<;~
> 2015-09-25 14:30:01,894 [tserver.TabletServer] INFO : ambari-ooziehive-r1-2.novalocal:9997:
got assignment from master: !0<;~
> 2015-09-25 14:30:02,833 [util.MetadataTableUtil] INFO : Scanning logging entries for
!0<;~
> 2015-09-25 14:30:02,862 [util.MetadataTableUtil] INFO : Scanning metadata for logs used
for tablet !0<;~
> 2015-09-25 14:30:02,924 [util.MetadataTableUtil] INFO : Returning logs [] for extent
!0<;~
> 2015-09-25 14:30:34,637 [server.TThreadPoolServer] ERROR: Error occurred during processing
of message.
> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException: Peer indicated
failure: GSS initiate failed
> 	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
> 	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:51)
> 	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory$1.run(UGIAssumingTransportFactory.java:48)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:360)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
> 	at org.apache.accumulo.core.rpc.UGIAssumingTransportFactory.getTransport(UGIAssumingTransportFactory.java:48)
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:208)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.thrift.transport.TTransportException: Peer indicated failure: GSS
initiate failed
> 	at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:190)
> 	at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
> 	at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
> 	at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
> 	at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
> 	... 11 more
> {code}
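> A quick diagnostic sketch for the "GSS initiate failed" errors above, run on the affected hosts (the keytab path and principal are the ones used by the tracer start command; the clock-skew check is a generic Kerberos suggestion, not something this report confirms):
> {code}
> # Is the headless keytab present, and does it contain the expected principal?
> klist -kt /etc/security/keytabs/accumulo.headless.keytab
> # Can that principal obtain a ticket from the KDC at all?
> kinit -kt /etc/security/keytabs/accumulo.headless.keytab custom-accumulo@EXAMPLE.COM && klist
> # Kerberos rejects authentication when host clocks drift too far; compare across the hosts listed below.
> date
> {code}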
> Live cluster (kept up for another 48 hours) on which the failure happened:
> 172.22.90.201	ambari-ooziehive-r1-5.novalocal	ambari-ooziehive-r1-5
> 172.22.90.200	ambari-ooziehive-r1-2.novalocal	ambari-ooziehive-r1-2
> 172.22.90.198	ambari-ooziehive-r1-3.novalocal	ambari-ooziehive-r1-3
> 172.22.90.197	ambari-ooziehive-r1-4.novalocal	ambari-ooziehive-r1-4
> 172.22.90.199	ambari-ooziehive-r1-1.novalocal	ambari-ooziehive-r1-1



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
