ambari-dev mailing list archives

From "Andrew Onischuk" <aonis...@hortonworks.com>
Subject Re: Review Request 38486: Error during update service configurations while kerberizing cluster post Ambari upgrade
Date Sat, 19 Sep 2015 14:34:51 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38486/
-----------------------------------------------------------

(Updated Sept. 19, 2015, 2:34 p.m.)


Review request for Ambari and Dmitro Lisnichenko.


Bugs: AMBARI-13144
    https://issues.apache.org/jira/browse/AMBARI-13144


Repository: ambari


Description
-------

Steps followed:
1. Install Ambari 1.6.1 with HDP 2.1.15.0-946
2. Enable Kerberos
3. Upgrade Ambari to 2.1.2-262
4. Remove the Ganglia service via the API:
curl -u admin:admin -H 'X-Requested-By:ambari' -X DELETE 'http://172.22.123.214:8080/api/v1/clusters/cl1/services/GANGLIA'
5. Add Ambari Metrics Service
6. Re-enable Kerberos and choose Existing MIT KDC; specify valid values in
the wizard and navigate to the 'Kerberize Cluster' screen

Result:  
Error at Update Service Configurations phase (see attached screenshot)

ambari-server.log shows the following NPE:

    17 Sep 2015 19:12:23,336 ERROR [Server Action Executor Worker 1325] ClusterImpl:2411 - No service found for config type '{}', service config version not created
    17 Sep 2015 19:12:23,534  WARN [Server Action Executor Worker 1325] ServerActionExecutor:479 - Task #1325 failed to complete execution due to thrown exception: java.lang.NullPointerException:null
    java.lang.NullPointerException
            at java.util.HashMap.putAll(HashMap.java:614)
            at org.apache.ambari.server.state.ConfigHelper.updateConfigType(ConfigHelper.java:691)
            at org.apache.ambari.server.serveraction.kerberos.UpdateKerberosConfigsServerAction.execute(UpdateKerberosConfigsServerAction.java:132)
            at org.apache.ambari.server.serveraction.ServerActionExecutor$Worker.execute(ServerActionExecutor.java:537)
            at org.apache.ambari.server.serveraction.ServerActionExecutor$Worker.run(ServerActionExecutor.java:474)
            at java.lang.Thread.run(Thread.java:745)
    17 Sep 2015 19:12:24,342  WARN [ambari-action-scheduler] ActionScheduler:311 - Operation completely failed, aborting request id:89
    17 Sep 2015 19:12:24,342  INFO [ambari-action-scheduler] ActionScheduler:700 - Service name is , component name is AMBARI_SERVER_ACTIONskipping sending ServiceComponentHostOpFailedEvent for AMBARI_SERVER_ACTION
    17 Sep 2015 19:12:24,346  INFO [ambari-action-scheduler] ActionDBAccessorImpl:176 - Aborting command. Hostname vsharma-u21todalm10-re-5.novalocal role AMBARI_SERVER_ACTION requestId null taskId 1326 stageId null
    17 Sep 2015 19:12:33,092  INFO [qtp-client-22] PersistKeyValueService:82 - Looking for keyName hostPopup-pagination-displayLength-admin
    17 Sep 2015 19:22:33,681  INFO [qtp-client-22] PersistKeyValueService:82 - Looking for keyName hostPopup-pagination-displayLength-admin
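
For readers unfamiliar with this failure mode: the top frame of the trace is java.util.HashMap.putAll, which is documented to throw NullPointerException when its argument is null. The sketch below reproduces that behaviour in isolation. The class PutAllNullSketch and the helper mergeProperties are hypothetical names made up for illustration; they are not Ambari code and not the change proposed in this review (the Diffs section lists ServiceConfigDAO and ServiceImpl).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Ambari.
class PutAllNullSketch {

    // Assumed helper: merges two property maps, treating a null map as empty.
    static Map<String, String> mergeProperties(Map<String, String> existing,
                                               Map<String, String> updates) {
        Map<String, String> merged = new HashMap<>();
        if (existing != null) {
            merged.putAll(existing);
        }
        if (updates != null) {
            merged.putAll(updates);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> target = new HashMap<>();
        try {
            // HashMap.putAll rejects a null argument.
            target.putAll(null);
        } catch (NullPointerException e) {
            System.out.println("putAll(null) threw NullPointerException");
        }

        // The guarded helper tolerates a missing (null) source map.
        Map<String, String> updates = new HashMap<>();
        updates.put("example.key", "example.value"); // made-up property
        System.out.println(mergeProperties(null, updates));
    }
}
```

A guard like this would only hide the symptom. The description that follows attributes the null to stale database rows, which is a data-consistency problem.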

After discussing with rlevas, it turns out that the root cause is that the API
call to delete Ganglia did not properly clean up its entries in the database,
which led to this issue:

    ambari=> select serviceconfig.service_name, clusterservices.service_name from
    serviceconfig left outer join clusterservices using (service_name) where
    clusterservices.service_name is null;
     service_name | service_name
    --------------+--------------
     GANGLIA      |
     GANGLIA      |
    (2 rows)
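
The query above is an anti-join: it keeps serviceconfig rows whose service_name has no match in clusterservices. As a rough sketch of the same check over in-memory data, the Java below filters a list of service-config names against a set of installed services. The class and method names are hypothetical, and the HDFS and YARN entries are invented sample data; only the two GANGLIA rows come from the output above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of Ambari.
class OrphanedServiceConfigSketch {

    // Returns every service-config name that has no matching cluster service,
    // keeping duplicates, like the LEFT OUTER JOIN ... IS NULL query.
    static List<String> findOrphans(List<String> serviceConfigNames,
                                    Set<String> clusterServiceNames) {
        List<String> orphans = new ArrayList<>();
        for (String name : serviceConfigNames) {
            if (!clusterServiceNames.contains(name)) {
                orphans.add(name);
            }
        }
        return orphans;
    }

    public static void main(String[] args) {
        // Sample data: HDFS and YARN are assumptions; GANGLIA was deleted via the API.
        List<String> serviceConfigRows = Arrays.asList("HDFS", "GANGLIA", "GANGLIA", "YARN");
        Set<String> clusterServices = new HashSet<>(Arrays.asList("HDFS", "YARN"));

        System.out.println(findOrphans(serviceConfigRows, clusterServices));
        // prints [GANGLIA, GANGLIA]
    }
}
```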


Diffs (updated)
-----

  ambari-server/src/main/java/org/apache/ambari/server/orm/dao/ServiceConfigDAO.java 1063c3f

  ambari-server/src/main/java/org/apache/ambari/server/state/ServiceImpl.java 34c7b81 

Diff: https://reviews.apache.org/r/38486/diff/


Testing
-------

Results :

Tests run: 3180, Failures: 0, Errors: 0, Skipped: 28

----------------------------------------------------------------------
Ran 261 tests in 7.061s

OK
----------------------------------------------------------------------
Total run:819
Total errors:0
Total failures:0
OK


Thanks,

Andrew Onischuk

