flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2072) Add a quickstart guide for FlinkML
Date Thu, 11 Jun 2015 07:51:01 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581609#comment-14581609 ]

ASF GitHub Bot commented on FLINK-2072:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/792#discussion_r32196879
  
    --- Diff: docs/libs/ml/quickstart.md ---
    @@ -24,4 +25,214 @@ under the License.
     * This will be replaced by the TOC
     {:toc}
     
    -Coming soon.
    +## Introduction
    +
    +FlinkML is designed to make learning from your data a straight-forward process, abstracting away
    +the complexities that usually come with having to deal with big data learning tasks. In this
    +quick-start guide we will show just how easy it is to solve a simple supervised learning problem
    +using FlinkML. But first some basics, feel free to skip the next few lines if you're already
    +familiar with Machine Learning (ML).
    +
    +As defined by Murphy [1] ML deals with detecting patterns in data, and using those
    +learned patterns to make predictions about the future. We can categorize most ML algorithms into
    +two major categories: Supervised and Unsupervised Learning.
    +
    +* **Supervised Learning** deals with learning a function (mapping) from a set of inputs
    +(features) to a set of outputs. The learning is done using a *training set* of (input,
    +output) pairs that we use to approximate the mapping function. Supervised learning problems are
    +further divided into classification and regression problems. In classification problems we try to
    +predict the *class* that an example belongs to, for example whether a user is going to click on
    +an ad or not. Regression problems one the other hand, are about predicting (real) numerical
    +values, often called the dependent variable, for example what the temperature will be tomorrow.
    +
    +* **Unsupervised Learning** deals with discovering patterns and regularities in the data. An example
    +of this would be *clustering*, where we try to discover groupings of the data from the
    +descriptive features. Unsupervised learning can also be used for feature selection, for example
    +through [principal components analysis](https://en.wikipedia.org/wiki/Principal_component_analysis).
    +
    +## Linking with FlinkML
    +
    +In order to use FlinkML in you project, first you have to
    --- End diff --
    
    your


> Add a quickstart guide for FlinkML
> ----------------------------------
>
>                 Key: FLINK-2072
>                 URL: https://issues.apache.org/jira/browse/FLINK-2072
>             Project: Flink
>          Issue Type: New Feature
>          Components: Documentation, Machine Learning Library
>            Reporter: Theodore Vasiloudis
>            Assignee: Theodore Vasiloudis
>             Fix For: 0.9
>
>
> We need a quickstart guide that introduces users to the core concepts of FlinkML to get them up and running quickly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
