chukwa-dev mailing list archives

From "Otis Gospodnetic (JIRA)" <>
Subject [jira] [Commented] (CHUKWA-680) Pattern recognition of Hadoop generated metrics
Date Fri, 27 Dec 2013 04:47:50 GMT


Otis Gospodnetic commented on CHUKWA-680:

Thanks Michael.  I see tables with 1/1 and 100% in Chapter 6, so that must be the accuracy.
 I have more questions :)
# I assume each cluster is different so one has to train a model for one's own cluster?
# Is this really about healthy vs. unhealthy, or is it more about typical vs. atypical clusters?
# If cluster's workload changes, does the model need to be retrained?
(over at we collect lots of metrics from different types of systems,
including Hadoop and HBase, so this is the angle my questions are coming from)

> Pattern recognition of Hadoop generated metrics
> -----------------------------------------------
>                 Key: CHUKWA-680
>                 URL:
>             Project: Chukwa
>          Issue Type: New Feature
>          Components: Data Collection
>         Environment: IBM InfoSphere BigInsights Enterprise
>            Reporter: michael yu
>            Assignee: michael yu
>            Priority: Minor
>              Labels: GSoC, GSoC2013
>         Attachments: Yu, Michael et al-project-report-draft.pdf
>   Original Estimate: 2,760h
>  Remaining Estimate: 2,760h
> Charles Lin and I are working on our IBM SJSU master's project on "Pattern recognition of Hadoop generated metrics".
> The purpose of the project is to use libsvm to predict the health of the cluster.
> The scope of the project includes:
> 1) gathering a large-scale data set of metrics for healthy and unhealthy clusters
> 2) using #1 and libsvm to generate a training model
> 3) periodically collecting metrics and comparing them against the training model using libsvm to predict the cluster health
>    a) if unhealthy, sending an email notification to the system administrator
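
As an illustration of steps 1 and 2 in the scope above, here is a minimal sketch of how labelled metric samples could be written in the sparse text format that libsvm reads ("<label> <index>:<value> ...", with 1-based ascending feature indices). The metric names, the +1/-1 label convention, and the helper names are assumptions made for this example; the thread does not say which metrics or labels the project actually uses.

```python
# Hypothetical metric names: the real Hadoop/Chukwa metric set is not given in the thread.
FEATURES = ["cpu_user_pct", "mem_used_pct", "dfs_under_replicated_blocks"]

# Assumed label convention for this sketch: +1 = healthy, -1 = unhealthy.
HEALTHY, UNHEALTHY = 1, -1


def to_libsvm_line(label, sample, features=FEATURES):
    """Format one metrics sample as a line in libsvm's sparse text format.

    Each line is '<label> <index>:<value> ...' with 1-based, ascending
    feature indices. Zero-valued features are omitted, which the sparse
    format allows.
    """
    parts = [str(label)]
    for index, name in enumerate(features, start=1):
        value = float(sample.get(name, 0.0))
        if value != 0.0:
            parts.append(f"{index}:{value:g}")
    return " ".join(parts)


def write_training_file(path, labelled_samples):
    """Write (label, sample) pairs to a file, one libsvm line per sample."""
    with open(path, "w", encoding="utf-8") as fh:
        for label, sample in labelled_samples:
            fh.write(to_libsvm_line(label, sample) + "\n")


if __name__ == "__main__":
    samples = [
        (HEALTHY, {"cpu_user_pct": 12.5, "mem_used_pct": 40}),
        (UNHEALTHY, {"cpu_user_pct": 99, "mem_used_pct": 97,
                     "dfs_under_replicated_blocks": 3}),
    ]
    for label, sample in samples:
        print(to_libsvm_line(label, sample))
```

A file written this way could then be given to the svm-train and svm-predict command-line tools that ship with libsvm. Whether the project uses those tools or one of the language bindings is not stated here.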

