flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3277) Use Value types in Gelly API
Date Thu, 28 Apr 2016 17:34:12 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15262588#comment-15262588 ]

ASF GitHub Bot commented on FLINK-3277:

GitHub user greghogan commented on the pull request:

    The two implementations have small differences, but the algorithm is the same. I'll be
removing the two steps concerned with degree skew: I had not previously looked at the
degree distribution, and I haven't found a graph that exhibits degree skew under the
algorithm's optimization of generating triplets from the vertex with the smallest degree.
It would be nice to have a proof, though.
    I expect most of the performance difference to be in `DegreeCounter` and `TriadBuilder`
caching objects but not supporting object reuse. Using immutable boxed primitives has the
same effect as disabling object reuse since deserialization must create a fresh object each

> Use Value types in Gelly API
> ----------------------------
>                 Key: FLINK-3277
>                 URL: https://issues.apache.org/jira/browse/FLINK-3277
>             Project: Flink
>          Issue Type: Improvement
>          Components: Gelly
>    Affects Versions: 1.0.0
>            Reporter: Greg Hogan
>            Assignee: Greg Hogan
> This would be a breaking change so the discussion needs to happen before the 1.0.0 release.
> I think it would benefit Flink to use {{Value}} types wherever possible. The {{Graph}}
> functions {{inDegrees}}, {{outDegrees}}, and {{getDegrees}} each return {{DataSet<Tuple2<K,
> Long>>}}. Using {{Long}} creates a new heap object for every serialization and deserialization.
> The mutable {{Value}} types do not suffer from this issue when object reuse is enabled.
> I lean towards a preference for conciseness in documentation and performance in examples
> and APIs.
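
To illustrate the allocation argument in the issue description, here is a minimal sketch in plain Java. It does not use Flink: `ImmutableLong` and `MutableLong` are hypothetical stand-ins for a boxed `Long` and for a mutable type such as `LongValue`, and the `allocations` counter is only an assumption made here to make object creation visible. It models the idea that an immutable holder needs a fresh object per record, while a mutable holder can be overwritten when object reuse is enabled.

```java
public class ValueReuseSketch {
    // Illustrative counter (not part of any Flink API): incremented by every constructor call.
    static int allocations = 0;

    /** Hypothetical immutable holder, standing in for a boxed java.lang.Long. */
    static final class ImmutableLong {
        final long value;

        ImmutableLong(long value) {
            this.value = value;
            allocations++;
        }
    }

    /** Hypothetical mutable holder, standing in for a mutable value type. */
    static final class MutableLong {
        long value;

        MutableLong() {
            allocations++;
        }
    }

    /** Immutable case: each record read requires a fresh object. */
    static long sumImmutable(long[] records) {
        long sum = 0;
        for (long r : records) {
            ImmutableLong v = new ImmutableLong(r); // one allocation per record
            sum += v.value;
        }
        return sum;
    }

    /** Mutable case with reuse: a single object is overwritten for each record. */
    static long sumMutable(long[] records) {
        MutableLong reuse = new MutableLong(); // one allocation in total
        long sum = 0;
        for (long r : records) {
            reuse.value = r; // overwrite in place instead of allocating
            sum += reuse.value;
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] degrees = {3, 1, 4, 1, 5}; // made-up vertex degrees

        allocations = 0;
        long immutableSum = sumImmutable(degrees);
        int immutableAllocations = allocations;

        allocations = 0;
        long mutableSum = sumMutable(degrees);
        int mutableAllocations = allocations;

        System.out.println("immutable: sum=" + immutableSum + ", allocations=" + immutableAllocations);
        System.out.println("mutable:   sum=" + mutableSum + ", allocations=" + mutableAllocations);
    }
}
```

Both methods compute the same sum; only the number of objects created differs, which is the cost the issue attributes to returning `Long` from the degree functions.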

This message was sent by Atlassian JIRA