hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion
Date Wed, 19 Dec 2012 07:31:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13535744#comment-13535744 ]

stack commented on HBASE-7386:
------------------------------

I'd say just repurpose 'autorestart', especially if it is now broken.  What was there previously
was Mickey Mouse.  This is the real deal.

bq. ... and does not respect any other environment settings (e.g. HBASE_CONF_DIR).

Would this be fixed if we "...through all the config files in hbase-daemon and do something
appropriate."?

On questions:

1. Yes, this is a valid direction.  Perhaps we could extract the stuff you hacked out into a
'wrapper' script, a poor man's supervise, so that it is there as an option: you could
run it if you wanted the poor man's supervise, but otherwise the scripts would run as they used to.
But this would likely be wasted effort; the effort is better spent making it so that, optionally,
if supervisord is installed, you can just run with it.
2. I agree with nkeywal that templates/samples inevitably rot.  Unused software also rots,
so any supervisord scripts we provide will go bad unless they are used.  How much work is involved
in making it so we could do ./bin/start-supervisord-hbase.sh?  It would be cool if you could do
./bin/start-hbase.sh, and ./bin/start-supervisord-hbase.sh if supervisor is available (likely on
most systems, I'd say), and then in the doc we encourage folks to do the latter.
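
One reading of point 2 is a launcher that prefers the supervisord-based script when supervisord is on the PATH and otherwise falls back to the existing script. The following is only a minimal sketch of that reading, not anything from the attached patches: the function name `choose_start_script` is hypothetical, and `./bin/start-supervisord-hbase.sh` is the script proposed above, which may not exist.

```python
import shutil
from typing import Callable, Optional


def choose_start_script(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Return the start script to use (hypothetical helper).

    Prefers the proposed supervisord-based script when a `supervisord`
    executable is found on the PATH; otherwise falls back to the
    existing start script. `which` is injectable so the lookup can be
    replaced in tests.
    """
    if which("supervisord") is not None:
        # Assumption: this script is the one proposed in the comment.
        return "./bin/start-supervisord-hbase.sh"
    return "./bin/start-hbase.sh"


if __name__ == "__main__":
    print(choose_start_script())
```

Keeping the choice in one place would mean both paths get exercised regularly, which speaks to the concern that unused scripts rot.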

What to do for the case where a shop has chosen something other than supervisord to monitor
its processes?  I suppose we could let them do the conversion from 'supervise' to 'god', etc.?

This is great stuff G.

> Investigate providing some supervisor support for znode deletion
> ----------------------------------------------------------------
>
>                 Key: HBASE-7386
>                 URL: https://issues.apache.org/jira/browse/HBASE-7386
>             Project: HBase
>          Issue Type: Task
>          Components: master, regionserver, scripts
>            Reporter: Gregory Chanan
>            Assignee: Gregory Chanan
>             Fix For: 0.96.0
>
>         Attachments: HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There are a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the underlying process
> so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs around and
> waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running it via something like supervisord can solve these issues if we provide
> the right support.
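
For readers unfamiliar with the script-based approach the description refers to (start the process, wait for it to exit, then delete the znode), here is a minimal sketch of that pattern. It is not the HBase implementation: `run_and_cleanup` and the `cleanup` callback are hypothetical names, and the actual znode deletion is left as a placeholder because its details are not given here.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_and_cleanup(cmd: Sequence[str], cleanup: Callable[[int], None]) -> int:
    """Run `cmd`, block until it exits, then invoke `cleanup`.

    `cleanup` receives the child's return code. On POSIX a negative
    value means the child was terminated by a signal (e.g. kill -9).
    """
    proc = subprocess.Popen(cmd)
    returncode = proc.wait()  # the wrapper "hangs around" here
    cleanup(returncode)       # e.g. delete the process's znode
    return returncode


if __name__ == "__main__":
    def delete_znode(returncode: int) -> None:
        # Placeholder only: a real wrapper would remove the znode here.
        print(f"child exited with {returncode}; znode would be deleted now")

    # Stand-in child process that exits immediately with status 3.
    run_and_cleanup([sys.executable, "-c", "import sys; sys.exit(3)"], delete_znode)
```

The sketch also shows why problem 2 arises: the wrapper stays alive for the whole lifetime of the child, so two processes are listed per launched daemon.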

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
