flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2533) Gap based random sample optimization
Date Mon, 14 Sep 2015 21:34:46 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744318#comment-14744318 ]

ASF GitHub Bot commented on FLINK-2533:
---------------------------------------

Github user fhueske commented on the pull request:

    https://github.com/apache/flink/pull/1110#issuecomment-140212159
  
    I will merge this PR tomorrow.


> Gap based random sample optimization
> ------------------------------------
>
>                 Key: FLINK-2533
>                 URL: https://issues.apache.org/jira/browse/FLINK-2533
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chengxiang Li
>            Assignee: GaoLun
>            Priority: Minor
>
> For fraction-based random samplers such as BernoulliSampler and PoissonSampler, a gap-based
> random sampler could use an O(k) sampling implementation (k being the number of sampled
> elements) instead of the previous O(n) implementation (n being the input size). It should
> perform better when the sample fraction is very small.
> [This blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/]
> describes gap-based random sampling in more detail.
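
To illustrate the idea in the description above, here is a minimal Python sketch of gap-based Bernoulli sampling. It is not the code from the pull request: the function name `gap_sample_bernoulli` and its parameters are hypothetical, and the input is assumed to be an indexable list. Instead of drawing one random number per element, the sketch draws the number of elements to skip before the next accepted one from a geometric distribution, so the number of random draws grows with the sample size rather than with the input size.

```python
import math
import random


def gap_sample_bernoulli(items, fraction, rng=None):
    """Bernoulli-sample `items` with probability `fraction` using geometric gaps.

    Hypothetical helper for illustration only; not Flink's BernoulliSampler.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    rng = rng or random.Random()
    if fraction == 0.0:
        return []
    if fraction == 1.0:
        return list(items)

    # log(1 - fraction); log1p stays accurate for very small fractions.
    log_reject = math.log1p(-fraction)
    n = len(items)
    sample = []
    index = -1
    while True:
        # u lies in (0, 1], so log(u) is defined and <= 0.
        u = 1.0 - rng.random()
        # Number of rejected elements before the next accepted one
        # (geometric distribution obtained by inverting its CDF).
        gap = int(math.log(u) / log_reject)
        index += gap + 1
        if index >= n:
            break
        sample.append(items[index])
    return sample


if __name__ == "__main__":
    data = list(range(1_000_000))
    picked = gap_sample_bernoulli(data, 0.001, random.Random(7))
    print(len(picked), "elements sampled out of", len(data))
```

Each element is still accepted independently with probability `fraction`, as in per-element Bernoulli sampling. The saving is largest when the fraction is small, because most elements are skipped without a random draw.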



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
