giraph-dev mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (GIRAPH-1161) implement random sampling for input splits
Date Fri, 29 Sep 2017 18:37:00 GMT

    [ https://issues.apache.org/jira/browse/GIRAPH-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186212#comment-16186212 ]

Hudson commented on GIRAPH-1161:
--------------------------------

FAILURE: Integrated in Jenkins build Giraph-trunk-Commit #1719 (See [https://builds.apache.org/job/Giraph-trunk-Commit/1719/])
GIRAPH-1161 (majakabiljo: [http://git-wip-us.apache.org/repos/asf?p=giraph.git&a=commit&h=3bbac90fc37d225ff4eefaab018fe6147a5a0937])
* (edit) giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java


> implement random sampling for input splits
> ------------------------------------------
>
>                 Key: GIRAPH-1161
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1161
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Jianlong Zhong
>            Priority: Minor
>
> Currently, if we are reading vertex/edge data from multiple tables and only want to
> read a fraction of the data (with the giraph.inputSplitSamplePercent conf option), we
> always get the first inputSplitSamplePercent percent of the input splits. We should
> instead use a random sample of input splits, so that testing on a sample of the data
> looks closer to an actual full-data run.
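
For illustration, the idea in the description could be sketched roughly as follows. This is a minimal sketch and not the change committed for this issue: the class SplitSampler and its sample method are hypothetical names, and the splits are modeled as a generic list rather than as Hadoop InputSplit objects. It shuffles a copy of the split list and keeps the first samplePercent percent, instead of always keeping the leading splits.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper for illustration only; not part of Giraph. */
final class SplitSampler {
  private SplitSampler() { }

  /**
   * Returns a uniformly random sample holding samplePercent percent
   * (rounded down) of the given splits. The input list is not modified.
   */
  static <T> List<T> sample(List<T> splits, float samplePercent, Random random) {
    if (samplePercent < 0f || samplePercent > 100f) {
      throw new IllegalArgumentException(
          "samplePercent must be in [0, 100]: " + samplePercent);
    }
    // Number of splits to keep, rounded down.
    int sampleSize = (int) (samplePercent * splits.size() / 100f);
    // Shuffle a copy so the caller's list keeps its original order.
    List<T> shuffled = new ArrayList<>(splits);
    Collections.shuffle(shuffled, random);
    // Taking a prefix of a shuffled list yields a uniform random subset.
    return new ArrayList<>(shuffled.subList(0, sampleSize));
  }
}
```

Passing the Random in explicitly is a design choice of this sketch: a fixed seed makes a sampled test run reproducible, while new Random() gives a different sample on each run.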



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
