giraph-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-1161) implement random sampling for input splits
Date Fri, 29 Sep 2017 18:37:00 GMT


ASF GitHub Bot commented on GIRAPH-1161:

Github user asfgit closed the pull request at:

> implement random sampling for input splits
> ------------------------------------------
>                 Key: GIRAPH-1161
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Jianlong Zhong
>            Priority: Minor
> Currently, if we read vertex/edge data from multiple tables and want to read only
> a fraction of the data (via the giraph.inputSplitSamplePercent conf option), we always
> get the first inputSplitSamplePercent percent of the input splits. We should instead
> use a random sample of the input splits, so that a test run on a sample of the data
> looks closer to an actual run on the full data.
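
The difference between the two behaviours described in the issue can be sketched as follows. This is only an illustration, not the code from the pull request: the class name SplitSampler, both method names, the explicit seed parameter, and the rule of keeping at least one split are assumptions made for the example. Only the idea of taking a percentage of the input splits comes from the issue text.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical helper; not part of the Giraph API. */
final class SplitSampler {
    private SplitSampler() { }

    /** Number of splits to keep for a given percentage (assumed: at least one). */
    static int sampleSize(int totalSplits, float percent) {
        if (totalSplits == 0) {
            return 0;
        }
        int count = (int) (percent * totalSplits / 100f);
        return Math.min(totalSplits, Math.max(1, count));
    }

    /** Behaviour described as current: always the first N splits. */
    static <T> List<T> firstPercent(List<T> splits, float percent) {
        int count = sampleSize(splits.size(), percent);
        return new ArrayList<>(splits.subList(0, count));
    }

    /** Behaviour proposed in the issue: a random N splits. */
    static <T> List<T> randomPercent(List<T> splits, float percent, long seed) {
        int count = sampleSize(splits.size(), percent);
        // Shuffle a copy so the caller's list keeps its original order.
        List<T> shuffled = new ArrayList<>(splits);
        // A fixed seed makes the sample reproducible between runs.
        Collections.shuffle(shuffled, new Random(seed));
        return new ArrayList<>(shuffled.subList(0, count));
    }
}
```

With several input tables laid out one after another in the split list, firstPercent would draw only from the leading table or tables, while randomPercent spreads the sample across all of them, which is the motivation stated in the issue.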

This message was sent by Atlassian JIRA
