X-Original-To: apmail-ambari-dev-archive@www.apache.org
Delivered-To: apmail-ambari-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id C305C17DD3 for ; Tue, 13 Jan 2015 10:46:33 +0000 (UTC)
Received: (qmail 88210 invoked by uid 500); 13 Jan 2015 10:46:35 -0000
Delivered-To: apmail-ambari-dev-archive@ambari.apache.org
Received: (qmail 88179 invoked by uid 500); 13 Jan 2015 10:46:35 -0000
Mailing-List: contact dev-help@ambari.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@ambari.apache.org
Delivered-To: mailing list dev@ambari.apache.org
Received: (qmail 88167 invoked by uid 99); 13 Jan 2015 10:46:35 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 13 Jan 2015 10:46:35 +0000
Date: Tue, 13 Jan 2015 10:46:35 +0000 (UTC)
From: "Hari Sekhon (JIRA)"
To: dev@ambari.apache.org
Subject: [jira] [Updated] (AMBARI-9022) Kerberos config lost after adding Kafka service, or Oozie service, or any service?
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


     [ https://issues.apache.org/jira/browse/AMBARI-9022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hari Sekhon updated AMBARI-9022:
--------------------------------
    Summary: Kerberos config lost after adding Kafka service, or Oozie service, or any service?  (was: Kerberos config lost after adding Kafka or Oozie service)

> Kerberos config lost after adding Kafka service, or Oozie service, or any service?
> ----------------------------------------------------------------------------------
>
>                 Key: AMBARI-9022
>                 URL: https://issues.apache.org/jira/browse/AMBARI-9022
>             Project: Ambari
>          Issue Type: Bug
>    Affects Versions: 1.7.0
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>            Priority: Blocker
>
> Adding the Kafka service to an existing kerberized HDP 2.2 cluster resulted in all the Kerberos fields in core-site.xml being set to either blank or the literal string "null", which prevented all the HDFS and Yarn instances from restarting. This caused a major outage - luckily this cluster isn't prod, but this is going to bite somebody badly.
> Error observed in NameNode log:
> {code}2015-01-07 09:56:01,958 INFO  namenode.NameNode (NameNode.java:setClientNamenodeAddress(369)) - Clients are to use nameservice1 to access this namenode/service.
> 2015-01-07 09:56:02,055 FATAL namenode.NameNode (NameNode.java:main(1509)) - Failed to start namenode.
> java.lang.IllegalArgumentException: Invalid rule: null
>         at org.apache.hadoop.security.authentication.util.KerberosName.parseRules(KerberosName.java:331)
>         at org.apache.hadoop.security.authentication.util.KerberosName.setRules(KerberosName.java:397)
>         at org.apache.hadoop.security.HadoopKerberosName.setConfiguration(HadoopKerberosName.java:75)
>         at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:263)
>         at org.apache.hadoop.security.UserGroupInformation.setConfiguration(UserGroupInformation.java:299)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:583)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
>         at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
> 2015-01-07 09:56:02,062 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
> 2015-01-07 09:56:02,064 INFO  namenode.NameNode (StringUtils.java:run(659)) - SHUTDOWN_MSG:{code}
> Fields that ended up with the literal string "null" as their value in core-site.xml:
> {code}hadoop.http.authentication.kerberos.keytab
> hadoop.http.authentication.kerberos.principal
> hadoop.security.auth_to_local{code}
> Fields that ended up with a blank ("") value in core-site.xml:
> {code}hadoop.http.authentication.cookie.domain
> hadoop.http.authentication.cookie.path
> hadoop.http.authentication.kerberos.name.rules
> hadoop.http.authentication.signature.secret
> hadoop.http.authentication.signature.secret.file
> hadoop.http.authentication.signer.secret.provider
> hadoop.http.authentication.signer.secret.provider.object
> hadoop.http.authentication.token.validity
> hadoop.http.filter.initializers{code}
> Previous config revisions now show these values as undefined, which was definitely not the case: for the past months this was a working, fully kerberized cluster.
> Removing the Kafka service via REST API calls and restarting ambari-server didn't make the config reappear either.
> I had to de-kerberize and then re-kerberize the whole cluster in Ambari in order to get all 12 of those configuration settings re-populated.
> A remaining side effect of this bug, even after recovering the cluster, is that all the previous config revisions are now ruined: they contain many undefined values that would prevent the cluster from starting, so they are no longer viable as a backup to revert to for any reason. There doesn't seem to be much I can do to work around that.
> Ironically, the Kafka brokers started up fine after ruining all the core components, since Kafka has no security itself.
> Regards,
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
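
The blank and literal "null" values described in the report can be spotted before restarting HDFS or Yarn by scanning core-site.xml. The following is only a minimal sketch of such a check: `find_suspect_properties`, `WATCHED_PREFIXES` and the `SAMPLE` document are hypothetical names and data made up for illustration, not part of Ambari or Hadoop, and the prefixes are simply derived from the 12 property names listed in the report.

```python
import xml.etree.ElementTree as ET

# Assumption: these prefixes cover the 12 properties named in the report.
WATCHED_PREFIXES = (
    "hadoop.http.authentication.",
    "hadoop.http.filter.initializers",
    "hadoop.security.auth_to_local",
)


def find_suspect_properties(xml_text, prefixes=WATCHED_PREFIXES):
    """Return {property name: "blank" | "null"} for watched properties
    whose value is empty or the literal string "null"."""
    root = ET.fromstring(xml_text)
    suspects = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if not name.startswith(prefixes):
            continue
        value = (prop.findtext("value") or "").strip()
        if value == "":
            suspects[name] = "blank"
        elif value == "null":
            suspects[name] = "null"
    return suspects


# Made-up sample in the Hadoop <configuration>/<property> layout.
SAMPLE = """<configuration>
  <property><name>hadoop.security.auth_to_local</name><value>null</value></property>
  <property><name>hadoop.http.authentication.cookie.domain</name><value></value></property>
  <property><name>hadoop.http.filter.initializers</name><value>org.apache.hadoop.security.AuthenticationFilterInitializer</value></property>
  <property><name>fs.defaultFS</name><value>hdfs://nameservice1</value></property>
</configuration>"""

if __name__ == "__main__":
    for name, problem in sorted(find_suspect_properties(SAMPLE).items()):
        print(f"{name}: {problem}")
```

To check a real cluster, the text of the host's core-site.xml would be passed in place of `SAMPLE`. A blank value may be legitimate for some of these properties on a non-kerberized cluster, so the result is a prompt for manual review rather than proof of the bug.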