ambari-user mailing list archives

From Alejandro Fernandez <afernan...@hortonworks.com>
Subject Re: Cannot get check status to work for a customer service
Date Sun, 24 May 2015 19:09:02 GMT
For server-side debugging, modify the ambari_server_main.py file to enable DEBUG; you can
then connect a remote debugger from your favorite IDE to port 5005.

For more verbose logging, raise the debug level in the ambari.properties file.

For python logging on the agents, I believe the default logger level is already DEBUG.


Thanks,

Alejandro


________________________________
From: Tim To <tto@phemi.com>
Sent: Friday, May 22, 2015 7:36 PM
To: Alejandro Fernandez
Cc: user@ambari.apache.org
Subject: Re: Cannot get check status to work for a customer service

Thanks Alejandro,
We use Python 2.7, and yes, I did test kill -0 with the contents of the pid file and got
the result that you show. It works now: I had a line of Execute() code in the status function
where the OS command was malformed, so the method blew up, but the exception was swallowed and I
didn't know there was a problem. Once I stripped out all the other code, it started working.
However, I still have the following questions:
- Is there a better way to debug? I tried "ambari-server start --debug" because the Ambari
code actually has extra debug statements that would have helped me, but I can't turn the debug
flag on. Is there a config file that I can change to get those debug logger statements
printed?

- This setup initially wasn't working because the metainfo file and the command script were
missing from the cache directory. As a last resort, I manually copied the scripts over
after restarting the Ambari server a few times. I am hoping that I am just missing a config step
somewhere that causes this; please confirm, since Ambari wasn't updating the cache directories
properly.
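
For the swallowed exception described above, one possible debugging aid is a small
decorator that logs the full traceback before re-raising. This is only a sketch: the
decorator name log_exceptions and the logger name "status_debug" are made up for
illustration and are not part of Ambari. Note that it would also log expected exceptions
such as ComponentIsNotRunning, so it is meant for temporary use while debugging.

```python
import functools
import logging

# Hypothetical logger name, chosen for this example only.
logger = logging.getLogger("status_debug")


def log_exceptions(func):
    """Log any exception raised by func with its traceback, then re-raise it."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            # logger.exception() logs at ERROR level and appends the traceback.
            logger.exception("%s failed", func.__name__)
            raise
    return wrapper
```

Applied to a status method as "@log_exceptions", a malformed command would then leave a
traceback in whatever log the logger is attached to, instead of failing silently.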

Tim

On Fri, May 22, 2015 at 7:01 PM, Alejandro Fernandez <afernandez@hortonworks.com> wrote:
Hi Tim,

Only make the changes in ambari-server's /var/lib/ambari-server/resources/stacks/… folder,
then restart ambari-server; this will cause the agents to update their cache.
Restarting ambari-server takes time, so I prefer to push the files with a shell script or pdsh.
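
As a rough illustration of the script approach, the sketch below mirrors one directory
tree into another on the local machine, replacing any stale copy. It is written in Python
rather than shell, the function name sync_tree is hypothetical, and both paths are
parameters because the exact service directories depend on your stack. Copying to remote
agent hosts (for example via pdsh) is not covered here.

```python
import os
import shutil


def sync_tree(src_dir, dst_dir):
    """Mirror src_dir into dst_dir and return the relative paths of copied files."""
    if not os.path.isdir(src_dir):
        raise ValueError("source directory not found: {0}".format(src_dir))
    # Remove a stale destination so deleted files do not linger in the copy.
    if os.path.isdir(dst_dir):
        shutil.rmtree(dst_dir)
    # copytree creates dst_dir and any missing parent directories.
    shutil.copytree(src_dir, dst_dir)
    copied = []
    for root, _dirs, files in os.walk(dst_dir):
        for name in files:
            copied.append(os.path.relpath(os.path.join(root, name), dst_dir))
    return sorted(copied)
```

Because the destination is deleted first, this should only be pointed at a directory that
holds nothing but the copied service files.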

What python version are you using?
Have you tried to run this manually?

[root@c6401 ~]# sleep 60 &
[root@c6401 ~]# ps aux | grep sleep
root     29010  0.0  0.0 100904   592 pts/1    S    01:59   0:00 sleep 60
root     29039  0.0  0.0 103236   864 pts/1    S+   01:59   0:00 grep sleep
[root@c6401 ~]# kill -0 29010
[root@c6401 ~]# echo $?
0
[root@c6401 ~]# kill -0 290100000000
-bash: kill: 290100000000: arguments must be process or job Ids

Thanks,
Alejandro

From: Tim To <tto@phemi.com>
Reply-To: "user@ambari.apache.org" <user@ambari.apache.org>
Date: Friday, May 22, 2015 at 11:26 AM
To: "user@ambari.apache.org" <user@ambari.apache.org>
Subject: Re: Cannot get check status to work for a customer service

A quick update:
I was able to find the check_process_status method, and all it does is read the pid from the
pid file and then do a kill -0 on the pid, so this should work, but for some reason it doesn't.
Perhaps I missed a config step and Ambari is somehow not calling our Python script. But
Ambari does call the script to start our app, so any suggestion is appreciated.

One more question: I set up the customer script in
/var/lib/ambari-server/resources/stacks/HDP/2.2/... but when Ambari starts it complains that
it can't find my script at
/var/lib/ambari-agent/cache/stacks/HDP/2.2/services/... so I manually created the directory
that Ambari is looking for and copied the script there, and Ambari was able to start our app.
Does anyone know how to set up a custom service so that I don't have to manually copy my script to
the /var/lib/ambari-agent/cache/... directory?

Many thanks!

The portion of the check_process_status code I was talking about:

  try:
      pid = int(sudo.read_file(pid_file))
  except:
      Logger.debug("Pid file {0} does not exist".format(pid_file))
      raise ComponentIsNotRunning()

  try:
      # Kill will not actually kill the process
      # From the doc:
      # If sig is 0, then no signal is sent, but error checking is still
      # performed; this can be used to check for the existence of a
      # process ID or process group ID.
      os.kill(pid, 0)
  except OSError:
      Logger.debug("Process with pid {0} is not running. Stale pid file"
                   " at {1}".format(pid, pid_file))
      raise ComponentIsNotRunning()
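
To try the same pid-file check outside Ambari, a standalone sketch might look like the
following. It is not the Ambari implementation: the function name is_process_running is
made up, it returns a boolean instead of raising ComponentIsNotRunning, and it reads the
file with plain open() rather than sudo.read_file. It also treats EPERM as "running",
which is an assumption of this sketch, since that error means the process exists but
belongs to another user.

```python
import errno
import os


def is_process_running(pid_file):
    """Return True if pid_file names a live process, else False."""
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (IOError, OSError, ValueError):
        # Missing or unreadable file, or contents that are not an integer.
        return False
    if pid <= 0:
        # 0 and negative values address process groups, not a single process.
        return False
    try:
        # Signal 0 sends nothing; it only performs the existence/permission check.
        os.kill(pid, 0)
    except OverflowError:
        # The value is too large to be a valid pid.
        return False
    except OSError as err:
        # EPERM: the process exists but we may not signal it.
        return err.errno == errno.EPERM
    return True
```

Running this by hand against the pid file, as the same user the agent runs as, is a quick
way to separate pid-file problems from script problems.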


On Fri, May 22, 2015 at 10:35 AM, Tim To <tto@phemi.com> wrote:
Hi all,
I am using Ambari 2.0 and Hadoop 2.4. We have a customer process, and I've been trying to set
up monitoring and alerts for it based on what I learned from two popular sites:
The official Ambari one:
https://cwiki.apache.org/confluence/display/AMBARI/Ambari
and another one that has more detailed explanations:
http://mozartanalytics.com/how-to-create-a-software-stack-for-ambari/?preview=true

So far I have been able to set it up: I can deploy our custom service and have Ambari start
it for me (so I think stop will work too). However, check status doesn't work. Based on
comments from the second site, I'm trying to pass a pid file location to check_process_status(),
after which Ambari should be able to tell whether the process is running or
not.
Here's my Python function for status:

  def status(self, env):
    print 'Status of Phemi Central';
    check_process_status('/home/testuser/appName/logs/pid-8888')

I manually checked the file after Ambari started our app, and it does contain the correct pid
for the process, but Ambari still thinks the app is "stopped".

- Any pointers on how check status works and how I am supposed to set it up are appreciated.
- Any more detailed documentation on setting up monitoring and alerts, in addition to the
above-mentioned websites, would be greatly helpful (even a confirmation that there is none
would save me searching for more :) )
- I also checked out the latest Ambari code from GitHub but am having a hard time locating where
check status is done, so any help with finding the code would also be helpful.

Thanks in advance everyone!!


--
Tim To
Software Engineer
PHEMI Health Systems
180-887 Great Northern Way
Vancouver, BC V5T 4T5
website<http://www.phemi.com/> twitter<https://twitter.com/PHEMISystems> Linkedin<http://www.linkedin.com/company/3561810?trk=tyah&trkInfo=tarId%3A1403279580554%2Ctas%3Aphemi%20hea%2Cidx%3A1-1-1>


