hbase-issues mailing list archives

From "Samir Ahmic (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7386) Investigate providing some supervisor support for znode deletion
Date Fri, 21 Jul 2017 20:07:00 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16096776#comment-16096776 ]

Samir Ahmic commented on HBASE-7386:
------------------------------------

Thanks [~stack]. Why python supervisor? Well, we originally started this story around it,
and after some time testing it, at least for me, choosing a mature and well-proven process
control system instead of writing custom bash scripts has multiple advantages.
To be honest, the work here extends the original issue of just removing stale znodes to creating a watchdog
over hbase processes and offering an alternative option for managing the cluster. But once we started
tackling the supervisor approach, why not offer folks the chance to
worry less when an rs process dies, because it will be automatically restarted :)
Also, python supervisor has a set of very cool features, like auto-restart and event listeners (which
may execute arbitrary code based on process state), and so on, and folks may start creating
their own listeners for different purposes.
Btw, I will address the shellcheck and pylint issues in the next patch.
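
To make the event listener idea concrete, here is a minimal sketch of a supervisor event listener in Python. It is not taken from any of the attached patches. It follows supervisor's documented listener protocol (write READY, read a header line and a payload of `len` characters, answer with RESULT) and reacts only to unexpected exits. The names `parse_tokens`, `znode_to_clean` and `delete_znode` are hypothetical, and `delete_znode` is a stub: the real znode cleanup would have to be supplied by the HBase scripts.

```python
import sys


def parse_tokens(line):
    """Parse supervisor's 'key:value key:value' token format into a dict."""
    return dict(token.split(":", 1) for token in line.split())


def znode_to_clean(headers, payload):
    """Return the process name whose znode should be removed, or None.

    Only unexpected exits (expected:0) are treated as crashes here.
    """
    if headers.get("eventname") != "PROCESS_STATE_EXITED":
        return None
    if payload.get("expected") != "0":
        return None
    return payload.get("processname")


def delete_znode(process_name):
    # Hypothetical stub: a real listener would remove the stale znode here.
    # Diagnostics go to stderr because stdout carries the listener protocol.
    sys.stderr.write("would delete znode for %s\n" % process_name)
    sys.stderr.flush()


def main(stdin=sys.stdin, stdout=sys.stdout):
    while True:
        # Tell supervisor we are ready for the next event.
        stdout.write("READY\n")
        stdout.flush()
        header_line = stdin.readline()
        if not header_line:
            break  # supervisor closed the pipe
        headers = parse_tokens(header_line)
        payload = parse_tokens(stdin.read(int(headers["len"])))
        name = znode_to_clean(headers, payload)
        if name is not None:
            delete_znode(name)
        # Acknowledge the event: "RESULT <length>\n" followed by the body "OK".
        stdout.write("RESULT 2\nOK")
        stdout.flush()
```

A real listener script would end with a call to `main()` and be registered in an `[eventlistener:x]` section of the supervisor config that subscribes to `PROCESS_STATE_EXITED`.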

> Investigate providing some supervisor support for znode deletion
> ----------------------------------------------------------------
>
>                 Key: HBASE-7386
>                 URL: https://issues.apache.org/jira/browse/HBASE-7386
>             Project: HBase
>          Issue Type: Task
>          Components: master, regionserver, scripts
>            Reporter: Gregory Chanan
>            Assignee: stack
>            Priority: Blocker
>             Fix For: 3.0.0
>
>         Attachments: HBASE-7386-bin.patch, HBASE-7386-bin-v2.patch, HBASE-7386-bin-v3.patch,
HBASE-7386-conf.patch, HBASE-7386-conf-v2.patch, HBASE-7386-conf-v3.patch, HBASE-7386-master-00.patch,
HBASE-7386-src.patch, HBASE-7386-v0.patch, supervisordconfigs-v0.patch
>
>
> There are a couple of JIRAs for deleting the znode on a process failure:
> HBASE-5844 (RS)
> HBASE-5926 (Master)
> which are pretty neat; on process failure, they delete the znode of the underlying process
so HBase can recover faster.
> These JIRAs were implemented via the startup scripts; i.e. the script hangs around and
waits for the process to exit, then deletes the znode.
> There are a few problems associated with this approach, as listed in the below JIRAs:
> 1) Hides startup output in script
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 2) two hbase processes listed per launched daemon
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 3) Not run by a real supervisor
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463409&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463409
> 4) Weird output after kill -9 actual process in standalone mode
> https://issues.apache.org/jira/browse/HBASE-5926?focusedCommentId=13506801&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506801
> 5) Can kill existing RS if called again
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13463401&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13463401
> 6) Hides stdout/stderr[6]
> https://issues.apache.org/jira/browse/HBASE-5844?focusedCommentId=13506832&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13506832
> I suspect running it via something like supervisord can solve these issues if we provide
the right support.
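
For illustration only, the "script hangs around and waits for the process to exit" pattern described above can be sketched in a few lines of Python. The actual implementation in HBASE-5844 and HBASE-5926 lives in the startup shell scripts, so this is not that code. `run_and_cleanup` and the `cleanup` callback are hypothetical names, and the callback stands in for whatever deletes the znode.

```python
import subprocess
import sys


def run_and_cleanup(command, cleanup):
    """Run command in the foreground, wait for it to exit, then clean up.

    The wrapper stays alive for the child's whole lifetime, which is why
    two processes show up per launched daemon with this approach.
    """
    exit_code = subprocess.call(command)
    # Hypothetical hook: the znode deletion would happen here.
    cleanup(exit_code)
    return exit_code
```

For example, `run_and_cleanup([sys.executable, "-c", "import sys; sys.exit(3)"], print)` runs a child that exits with status 3 and then passes that status to the callback. Under a real supervisor, the waiting and the reaction to the exit are handled by the supervisor itself rather than by a wrapper like this.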



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)