flink-issues mailing list archives

From "Greg Hogan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2715) Benchmark Triangle Count methods
Date Fri, 22 Apr 2016 17:09:13 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254230#comment-15254230 ]

Greg Hogan commented on FLINK-2715:

The performance of {{TriangleEnumerator}} was considerably worse until the recent fixes in
FLINK-3770.  This algorithm could also be updated to initially order edges by lower degree
rather than higher ID. It should also run faster with the upcoming hashing combiner. The use
of {{TreeMap}} likely limits the performance relative to {{TriangleEnumerator}}.
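
As a rough illustration of what ordering edges by lower degree could look like, here is a minimal single-machine sketch in plain Java. It is not the Gelly {{TriangleEnumerator}} and uses no Flink API; the class and method names ({{DegreeOrderedTriangles}}, {{countTriangles}}, {{before}}) are hypothetical, and the input is assumed to be a small undirected edge list of integer vertex IDs. Each edge is oriented from its lower-degree endpoint to its higher-degree endpoint (ties broken by lower ID), so high-degree vertices keep few outgoing edges and generate few candidate pairs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Flink or Gelly.
class DegreeOrderedTriangles {

    // True if vertex a ranks before vertex b: lower degree first, ties broken by lower ID.
    static boolean before(int a, int b, Map<Integer, Set<Integer>> adj) {
        int da = adj.get(a).size();
        int db = adj.get(b).size();
        return da != db ? da < db : a < b;
    }

    static long countTriangles(int[][] edges) {
        // Build undirected adjacency sets; this drops duplicate edges and self-loops.
        Map<Integer, Set<Integer>> adj = new HashMap<>();
        for (int[] e : edges) {
            if (e[0] == e[1]) {
                continue;
            }
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }

        // Orient every edge from the lower-ranked endpoint to the higher-ranked one.
        Map<Integer, List<Integer>> out = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : adj.entrySet()) {
            int u = entry.getKey();
            for (int v : entry.getValue()) {
                if (before(u, v, adj)) {
                    out.computeIfAbsent(u, k -> new ArrayList<>()).add(v);
                }
            }
        }

        // Each pair of out-neighbors of u is a candidate; it closes a triangle
        // if the two neighbors are themselves adjacent. Because u is the
        // lowest-ranked vertex of the triangle, each triangle is counted once.
        long triangles = 0;
        for (List<Integer> nbrs : out.values()) {
            for (int i = 0; i < nbrs.size(); i++) {
                for (int j = i + 1; j < nbrs.size(); j++) {
                    if (adj.get(nbrs.get(i)).contains(nbrs.get(j))) {
                        triangles++;
                    }
                }
            }
        }
        return triangles;
    }

    public static void main(String[] args) {
        // One triangle (0, 1, 2) plus a pendant edge (2, 3).
        int[][] edges = {{0, 1}, {1, 2}, {0, 2}, {2, 3}};
        System.out.println("triangles = " + countTriangles(edges));
    }
}
```

With ID-based ordering, a hub vertex that happens to have a low ID would emit a candidate pair for every two of its neighbors; ranking by degree moves that work onto low-degree vertices instead.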

Implementation of the Global Clustering Coefficient requires the triangle count and I've been
working on what I think will be a nice way to capture algorithm metrics without duplicating

The Flink bug has been filed as FLINK-3805.

> Benchmark Triangle Count methods
> --------------------------------
>                 Key: FLINK-2715
>                 URL: https://issues.apache.org/jira/browse/FLINK-2715
>             Project: Flink
>          Issue Type: Task
>          Components: Gelly
>    Affects Versions: 0.10.0
>            Reporter: Andra Lungu
>            Priority: Minor
>              Labels: starter
> Once FLINK-2714 is addressed, it would be nice to have a set of benchmarks that test
> the efficiency of the DataSet, GSA and vertex-centric versions.
> This means running the three examples on a cluster environment using various graph DataSets.
> For instance, SNAP's Orkut and Friendster networks
> (https://snap.stanford.edu/data/).
> The results produced by the experiments should then be reported in the Gelly docs.

This message was sent by Atlassian JIRA
