helix-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HELIX-622) Add new resource configuration option to allow resource to disable emmiting monitoring bean.
Date Mon, 11 Jan 2016 20:06:40 GMT

    [ https://issues.apache.org/jira/browse/HELIX-622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15092582#comment-15092582 ]

ASF GitHub Bot commented on HELIX-622:
--------------------------------------

GitHub user lei-xia opened a pull request:

    https://github.com/apache/helix/pull/41

    A few more task framework improvement

    This pull request includes four diffs, each described below:
    
    1.  [HELIX-622] Add new resource configuration option to allow resource to disable emmiting
monitoring bean.
      Description:
        Helix creates a set of metrics for each resource. Since a job is treated as a regular
resource by Helix, each job emits a new set of metrics to our internal monitoring system.
These metrics are dynamic date metrics and most of them are empty, so it is meaningless to put
alerts on them. They are barely used in practice and merely consume the metric name
space.
    
      On the other hand, we still need some stable metrics (a fixed set of metric names)
for the operations team to monitor the queue and job running status.
    
      As a short-term solution, we can add an option in JobConfig to enable emitting a metric
for a job; by default, this is disabled. As a next step, we will need to add a new set
of metrics for jobs and workflows.
    
    
    2.  Do not expose internal configuration field names. These field names should be used only
by Helix. Clients should always use JobConfig.Builder to create a JobConfig, and should construct
a JobConfig from a HelixProperty before reading fields from it. Clients are advised not to interpret
fields from a ZNRecord directly.
    
    3.  Clean up integration tests for the task framework and move shared parts to TaskTestUtil.java.
    
    4.  Job hung if the target resource does not exist anymore at the time when it is scheduled.
      Problem: When a job gets scheduled, if the target resource no longer exists
(e.g., the database has already been deleted but the backup job is still there), the job is
stuck and all remaining jobs are stuck as well.
      Change: If the target resource of a job does not exist, the job should be failed immediately.
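
The fail-fast behavior described in item 4 can be sketched as follows. This is a minimal,
self-contained illustration and not the Helix implementation: the class FailFastSchedulingSketch,
the Job and JobState types, and the scheduleAll method are all hypothetical names invented for
this example, and the real task framework's scheduling logic is assumed to be more involved.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Helix code: every name here is illustrative.
final class FailFastSchedulingSketch {
    enum JobState { NOT_STARTED, IN_PROGRESS, FAILED }

    /** Minimal stand-in for a job that targets a named resource. */
    static final class Job {
        final String name;
        final String targetResource;
        JobState state = JobState.NOT_STARTED;

        Job(String name, String targetResource) {
            this.name = name;
            this.targetResource = targetResource;
        }
    }

    /**
     * Walks the queue in order. A job whose target resource is missing is
     * marked FAILED immediately instead of waiting, so later jobs are not
     * blocked behind it.
     */
    static void scheduleAll(List<Job> queue, Set<String> existingResources) {
        for (Job job : queue) {
            if (!existingResources.contains(job.targetResource)) {
                job.state = JobState.FAILED; // fail fast, do not hang
                continue;
            }
            job.state = JobState.IN_PROGRESS;
        }
    }

    public static void main(String[] args) {
        // "db1" has been deleted; only "db2" still exists.
        List<Job> queue = Arrays.asList(
            new Job("backup-db1", "db1"),
            new Job("backup-db2", "db2"));
        scheduleAll(queue, Collections.singleton("db2"));
        for (Job job : queue) {
            System.out.println(job.name + " -> " + job.state);
        }
    }
}
```

In this sketch the backup job for the deleted database ends up FAILED while the job behind it
still moves to IN_PROGRESS, which is the outcome the change aims for.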
 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lei-xia/helix helix-0.6.x

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/helix/pull/41.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #41
    
----
commit 32c463d9156017f048fe53830872efc26e99b7db
Author: Lei Xia <lxia@linkedin.com>
Date:   2016-01-09T01:14:00Z

    [HELIX-622] Add new resource configuration option to allow resource to disable emmiting
monitoring bean.

commit a108cfb348b8ea0fdac3764b6c1672755fe64489
Author: Lei Xia <lxia@linkedin.com>
Date:   2016-01-09T01:25:09Z

    [HELIX-623] Do not expose internal configuration field name. Client should use JobConfig.Builder
to create jobConfig.

commit f72627c7d2c7aa9b31fa69c5832226396995c20a
Author: Lei Xia <lxia@linkedin.com>
Date:   2016-01-09T01:27:01Z

    Clean up unit tests for task framework.

commit 8e2bf24c293afebd83076d9ee810cef4e43ab915
Author: Lei Xia <lxia@linkedin.com>
Date:   2016-01-09T01:28:17Z

    [HELIX-618]  Job hung if the target resource does not exist anymore at the time when it
is scheduled.

----


> Add new resource configuration option to allow resource to disable emmiting monitoring
bean.
> --------------------------------------------------------------------------------------------
>
>                 Key: HELIX-622
>                 URL: https://issues.apache.org/jira/browse/HELIX-622
>             Project: Apache Helix
>          Issue Type: Bug
>            Reporter: Lei Xia
>            Assignee: Lei Xia
>
> Helix creates a set of metrics for each resource. Since a job is treated as a regular resource
by Helix, each job emits a new set of metrics to ingraph. These metrics are dynamic
date metrics and most of them are empty, so it is meaningless to put alerts on them. They are
barely used in practice.
> On the other hand, we still need some stable metrics (a fixed set of metric names)
for the operations team to monitor the queue and job running status.
> As a short-term solution, we can add an option in JobConfig to enable emitting a metric
for a job; by default, this is disabled. As a next step, we will need to add a new set
of metrics for jobs and workflows.
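
A minimal sketch of the short-term option described above, assuming a builder-style config in
which per-job metric emission stays off unless explicitly requested. The class JobConfigSketch
and its setMonitoringEnabled and isMonitoringEnabled methods are hypothetical names chosen for
illustration; they are not the actual JobConfig.Builder API.

```java
// Hypothetical sketch, not the Helix JobConfig API: names are illustrative.
final class JobConfigSketch {
    private final String command;
    private final boolean monitoringEnabled;

    private JobConfigSketch(Builder builder) {
        this.command = builder.command;
        this.monitoringEnabled = builder.monitoringEnabled;
    }

    String getCommand() {
        return command;
    }

    boolean isMonitoringEnabled() {
        return monitoringEnabled;
    }

    /** Clients go through the builder rather than raw field names. */
    static final class Builder {
        private String command;
        private boolean monitoringEnabled = false; // disabled by default

        Builder setCommand(String command) {
            this.command = command;
            return this;
        }

        Builder setMonitoringEnabled(boolean enabled) {
            this.monitoringEnabled = enabled;
            return this;
        }

        JobConfigSketch build() {
            if (command == null) {
                throw new IllegalStateException("command is required");
            }
            return new JobConfigSketch(this);
        }
    }

    public static void main(String[] args) {
        JobConfigSketch quiet = new Builder().setCommand("Backup").build();
        JobConfigSketch monitored =
            new Builder().setCommand("Backup").setMonitoringEnabled(true).build();
        System.out.println(quiet.isMonitoringEnabled());     // prints false
        System.out.println(monitored.isMonitoringEnabled()); // prints true
    }
}
```

Keeping the flag behind the builder also matches the intent of not exposing internal field
names: callers opt in through a method rather than by writing a raw config key.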



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
