chukwa-dev mailing list archives

From "michael yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CHUKWA-680) Pattern recognition of Hadoop generated metrics
Date Fri, 27 Dec 2013 05:27:50 GMT

    [ https://issues.apache.org/jira/browse/CHUKWA-680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13857313#comment-13857313 ]

michael yu commented on CHUKWA-680:
-----------------------------------

Sure thing.

# Each cluster will have its own training model.
# You are correct.  It is more along the lines of typical vs. atypical.
# If the workload changes to something the existing training model has never seen (i.e. it
was not trained on that kind of data), the SVM engine will most likely predict that the new
data is "atypical".  At that point, a notification will be sent to any registered email
addresses.  The user can correct that "atypical" data point if it is actually "typical".
If this is done, the model will be retrained.
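
A rough sketch of the predict / notify / correct / retrain loop described in the last point
may help. This is only an illustration, not the project's code: the names ClusterModel,
check, and send are hypothetical, and a simple distance-from-centroid rule stands in for
the libsvm engine so that the example runs with the Python standard library alone.

```python
from dataclasses import dataclass, field
from statistics import mean

TYPICAL, ATYPICAL = "typical", "atypical"


@dataclass
class ClusterModel:
    """Per-cluster model (hypothetical). A centroid rule stands in for the SVM."""
    samples: list = field(default_factory=list)  # (metrics vector, label) pairs
    centroid: list = field(default_factory=list)
    radius: float = 0.0

    def train(self):
        # The stand-in learns only from "typical" samples; "atypical" ones are
        # stored but unused here (a real SVM would train on both classes).
        typical = [x for x, label in self.samples if label == TYPICAL]
        self.centroid = [mean(col) for col in zip(*typical)]
        self.radius = max(self._distance(x) for x in typical)

    def _distance(self, x):
        return sum((a - b) ** 2 for a, b in zip(x, self.centroid)) ** 0.5

    def predict(self, x):
        # Anything farther out than the farthest typical sample is "atypical".
        return TYPICAL if self._distance(x) <= self.radius else ATYPICAL

    def correct(self, x, label):
        # User feedback: record the corrected label and retrain the model.
        self.samples.append((list(x), label))
        self.train()


def check(model, x, recipients, send):
    """Classify one metrics vector and notify every registered address if atypical."""
    label = model.predict(x)
    if label == ATYPICAL:
        for address in recipients:
            send(address, f"atypical metrics: {x}")
    return label
```

Here send is any callable taking an address and a message. Once a user relabels a flagged
vector through correct(), the retrained model accepts that kind of workload as typical.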

> Pattern recognition of Hadoop generated metrics
> -----------------------------------------------
>
>                 Key: CHUKWA-680
>                 URL: https://issues.apache.org/jira/browse/CHUKWA-680
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>         Environment: IBM InfoSphere BigInsights Enterprise
>            Reporter: michael yu
>            Assignee: michael yu
>            Priority: Minor
>              Labels: GSoC, GSoC2013
>         Attachments: Yu, Michael et al-project-report-draft.pdf
>
>   Original Estimate: 2,760h
>  Remaining Estimate: 2,760h
>
> Charles Lin and I are working on our IBM SJSU master's project on "Pattern recognition of Hadoop generated metrics".
> The purpose of the project is to use libsvm to predict the health of the cluster.
> The scope of the project includes:
> 1) gathering a large-scale data set of metrics for healthy and unhealthy clusters
> 2) using #1 and libsvm to generate a training model
> 3) periodically collecting metrics and comparing them against the training model using libsvm to predict the cluster health
>    a) if unhealthy, sending an email notification to the system administrator



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
