flink-issues mailing list archives

From sachingoel0101 <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-2131][ml]: Initialization schemes for k...
Date Tue, 30 Jun 2015 15:06:26 GMT
Github user sachingoel0101 commented on the pull request:

    Hey @thvasilo, I'm going to break this PR up further. The motivation is that the sampling
code should be available as a general feature: given a probability distribution over the data,
a user should be able to sample as many points as they want.
    The Sampler will take as input the DataSet, the number of samples required, and a function
that determines the relative probability of a particular element being picked, along with a flag
specifying whether elements are sampled with or without replacement.
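
    A minimal sketch of the interface described above, in plain Python over an in-memory
list rather than a Flink DataSet. This is not the PR's code: the names `weighted_sample`,
`weight_fn` and `with_replacement` are hypothetical, and the without-replacement branch
assumes the Efraimidis-Spirakis key method, which the comment itself does not specify.

```python
import heapq
import random


def weighted_sample(data, num_samples, weight_fn, with_replacement, rng=None):
    """Hypothetical helper: draw num_samples elements from data.

    weight_fn(x) returns the relative (unnormalized) probability of x.
    """
    if rng is None:
        rng = random.Random()
    items = list(data)
    weights = [weight_fn(x) for x in items]
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    if with_replacement:
        # Independent draws; the same element may appear more than once.
        return rng.choices(items, weights=weights, k=num_samples)

    # Without replacement (assumed technique: Efraimidis-Spirakis).
    # Each element with positive weight w gets the key u ** (1 / w),
    # with u uniform in [0, 1); the num_samples largest keys are kept.
    keyed = [
        (rng.random() ** (1.0 / w), i)
        for i, w in enumerate(weights)
        if w > 0
    ]
    if num_samples > len(keyed):
        raise ValueError("not enough elements with positive weight")
    top = heapq.nlargest(num_samples, keyed)
    return [items[i] for _, i in top]
```

    Zero-weight elements are never picked in either mode. A distributed version would
additionally need to compute keys or normalizing sums across partitions, which this
sketch does not attempt.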

    Let me know your thoughts. I'll work out a version in the meantime. If this is desirable,
I will file a JIRA ticket and open a separate PR.

