flink-issues mailing list archives

From ChengXiangLi <...@git.apache.org>
Subject [GitHub] flink pull request: [FLINK-7] [Runtime] Enable Range Partitioner.
Date Mon, 21 Dec 2015 12:53:31 GMT
Github user ChengXiangLi commented on the pull request:

    Sorry, @fhueske, I misunderstood your test data: the keys should be skewed toward some
values, whereas in my previous test the keys were not skewed. It is complicated to calculate how
many samples must be taken from a dataset to meet an a priori specified accuracy guarantee. One
such algorithm is described at http://research.microsoft.com/pubs/159275/MSR-TR-2012-18.pdf,
which I have used before, but it may not fully fit the case where the keys are skewed.
    Would you continue testing to find how many samples are required to make the partitions
roughly balanced? Raising the sample count should not add much overhead, so I fully support it.
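
To make the question concrete, here is a minimal sketch in plain Python (not Flink code, and not the algorithm from the linked report) of how one might measure partition balance as a function of sample size. All function names, the 40% hot-key ratio, and the partition count are assumptions chosen for illustration. It draws a uniform reservoir sample, picks split points at evenly spaced sample quantiles, and reports the largest partition relative to the ideal size.

```python
import bisect
import random
from collections import Counter


def reservoir_sample(items, k, rng):
    """Uniform random sample of up to k items from an iterable (Algorithm R)."""
    sample = []
    for i, item in enumerate(items):
        if i < k:
            sample.append(item)
        else:
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                sample[j] = item
    return sample


def range_boundaries(sample, num_partitions):
    """Pick num_partitions - 1 split points at evenly spaced sample quantiles."""
    ordered = sorted(sample)
    n = len(ordered)
    return [ordered[(i * n) // num_partitions] for i in range(1, num_partitions)]


def partition_sizes(keys, boundaries):
    """Count records per partition; a key equal to a boundary goes to the left side."""
    counts = Counter(bisect.bisect_left(boundaries, key) for key in keys)
    return [counts.get(p, 0) for p in range(len(boundaries) + 1)]


def imbalance(sizes):
    """Largest partition divided by the ideal (mean) partition size; 1.0 is perfect."""
    return max(sizes) * len(sizes) / sum(sizes)


if __name__ == "__main__":
    rng = random.Random(42)
    n, parts = 100_000, 8
    uniform = [rng.random() for _ in range(n)]
    # Hypothetical skew: about 40% of the records share a single hot key.
    skewed = [0.5 if rng.random() < 0.4 else rng.random() for _ in range(n)]
    for name, keys in (("uniform", uniform), ("skewed", skewed)):
        for k in (100, 1_000, 10_000):
            bounds = range_boundaries(reservoir_sample(keys, k, rng), parts)
            ratio = imbalance(partition_sizes(keys, bounds))
            print(f"{name:8s} sample={k:6d} imbalance={ratio:.2f}")
```

In this sketch all records with the same key land in one partition, so with a single hot key the imbalance cannot drop below the hot key's share times the number of partitions, however large the sample is. A larger sample only helps with the remaining, non-hot part of the key space.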
