flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2533) Gap based random sample optimization
Date Mon, 14 Sep 2015 12:44:46 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743460#comment-14743460 ]

ASF GitHub Bot commented on FLINK-2533:

Github user fhueske commented on the pull request:

    @gallenvara, thanks for the PR.
    Looks good to me.

> Gap based random sample optimization
> ------------------------------------
>                 Key: FLINK-2533
>                 URL: https://issues.apache.org/jira/browse/FLINK-2533
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chengxiang Li
>            Assignee: GaoLun
>            Priority: Minor
> For random samplers with a fraction, such as BernoulliSampler and PoissonSampler, a gap-based
random sampler could use an O(k) sampling implementation instead of the previous O(n)
implementation; it should perform better when the sample fraction is very small. [This blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/]
describes the gap-based random sampler in more detail.
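
For illustration only, the idea behind gap sampling for the Bernoulli case can be sketched as below. This is a minimal Python sketch, not Flink's BernoulliSampler and not the code from the pull request; the function name `bernoulli_gap_sample` and its parameters are hypothetical. It assumes the input is an indexable sequence and reads k as the number of sampled elements and n as the input size. Instead of drawing one random number per element, it draws the number of elements to skip before the next accepted one from a geometric distribution, so roughly k + 1 random draws are needed.

```python
import math
import random


def bernoulli_gap_sample(items, fraction, rng=None):
    """Keep each element of `items` independently with probability `fraction`.

    Hypothetical helper for illustration; `items` must support len() and indexing.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = rng or random.Random()
    if fraction == 0.0:
        return []
    if fraction == 1.0:
        return list(items)

    # log(1 - fraction), computed with log1p so tiny fractions do not round to 0.
    log_q = math.log1p(-fraction)
    n = len(items)
    sample = []
    index = -1
    while True:
        # u lies in (0, 1], so log(u) is defined and <= 0.
        u = 1.0 - rng.random()
        # Number of rejected elements before the next accepted one
        # (geometric distribution: P(gap >= j) = (1 - fraction) ** j).
        gap = int(math.log(u) / log_q)
        index += gap + 1
        if index >= n:
            break
        sample.append(items[index])
    return sample


if __name__ == "__main__":
    data = list(range(1_000_000))
    picked = bernoulli_gap_sample(data, 0.001, random.Random(7))
    print(len(picked))  # close to 1000 on average; exact value depends on the seed
```

The loop body runs once per accepted element plus one final overshoot, which is where the benefit for very small fractions comes from. For larger fractions the cost of the logarithm can outweigh the saved random draws, so a per-element test may be the better choice there.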

This message was sent by Atlassian JIRA
