flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-1901) Create sample operator for Dataset
Date Wed, 29 Jul 2015 07:56:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645658#comment-14645658 ]

ASF GitHub Bot commented on FLINK-1901:

GitHub user ChengXiangLi opened a pull request:


    [FLINK-1901] [core] Create sample operator for Dataset.

    This PR includes:
    1. Four random sampler implementations for different sampling strategies.
    2. A sample operator for the DataSet Java API.
    3. Unit tests for the random samplers.
    4. An integration test for the sample operator in the Java API.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChengXiangLi/flink FLINK-1901

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #949

commit f7ba8779b8d6a6d66ab5d4e2435a70e220b1e0fc
Author: chengxiang li <chengxiang.li@intel.com>
Date:   2015-07-22T03:38:13Z

    [FLINK-1901] [core] Create sample operator for Dataset.


> Create sample operator for Dataset
> ----------------------------------
>                 Key: FLINK-1901
>                 URL: https://issues.apache.org/jira/browse/FLINK-1901
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Theodore Vasiloudis
>            Assignee: Chengxiang Li
> In order to implement Stochastic Gradient Descent and a number of other machine learning
> algorithms, we need a way to take a random sample from a Dataset.
> We need to be able to sample with or without replacement from the Dataset, choose the
> relative size of the sample, and set a seed for reproducibility.
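
The three requirements in the issue description (with or without replacement, a relative sample size, and a seed) can be illustrated with a minimal plain-Java sketch. This is not the code from the pull request and does not use the Flink API. The class name SampleSketch, its method names, and the use of in-memory lists are assumptions made for illustration only. It uses per-element Bernoulli trials for sampling without replacement and per-element Poisson counts as a common approximation of sampling with replacement, so the sample size is only approximately fraction * n.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical illustration class; not part of Flink or of the pull request.
public class SampleSketch {

    // Without replacement: keep each element independently with probability `fraction`.
    public static <T> List<T> sampleWithoutReplacement(List<T> input, double fraction, long seed) {
        if (fraction < 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in [0, 1]");
        }
        Random rnd = new Random(seed); // fixed seed makes the sample reproducible
        List<T> out = new ArrayList<>();
        for (T item : input) {
            if (rnd.nextDouble() < fraction) {
                out.add(item);
            }
        }
        return out;
    }

    // With replacement (approximation): emit each element k times, k ~ Poisson(fraction).
    public static <T> List<T> sampleWithReplacement(List<T> input, double fraction, long seed) {
        if (fraction < 0.0) {
            throw new IllegalArgumentException("fraction must be non-negative");
        }
        Random rnd = new Random(seed);
        List<T> out = new ArrayList<>();
        for (T item : input) {
            int copies = poisson(fraction, rnd);
            for (int i = 0; i < copies; i++) {
                out.add(item);
            }
        }
        return out;
    }

    // Knuth's multiplication method for Poisson variates; adequate for small means.
    static int poisson(double mean, Random rnd) {
        double limit = Math.exp(-mean);
        double product = rnd.nextDouble();
        int count = 0;
        while (product > limit) {
            count++;
            product *= rnd.nextDouble();
        }
        return count;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            data.add(i);
        }
        // Same seed, same sample: the two sizes printed on each line always match.
        System.out.println(sampleWithoutReplacement(data, 0.1, 42L).size()
                + " " + sampleWithoutReplacement(data, 0.1, 42L).size());
        System.out.println(sampleWithReplacement(data, 0.1, 42L).size()
                + " " + sampleWithReplacement(data, 0.1, 42L).size());
    }
}
```

Both methods make a single pass over the input and keep no state besides the random generator, which is why this style of sampler is a natural fit for partitioned data. A sampler that must return an exact number of elements would need a different technique, such as reservoir sampling.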
