flink-issues mailing list archives

From "Theodore Vasiloudis (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-2202) Calling distinct() requires tuple input
Date Wed, 10 Jun 2015 15:47:00 GMT
Theodore Vasiloudis created FLINK-2202:

             Summary: Calling distinct() requires tuple input
                 Key: FLINK-2202
                 URL: https://issues.apache.org/jira/browse/FLINK-2202
             Project: Flink
          Issue Type: Improvement
          Components: Core, Scala API
            Reporter: Theodore Vasiloudis
            Priority: Minor

Currently, to call distinct() on a DataSet, the elements must be wrapped in a tuple.

This creates the need to write code like the following:

val doubleDS: DataSet[Double] = ...
val uniqueDS = doubleDS.map(el => Tuple1(el)).distinct().map(t => t._1)

which looks quite unnecessary. Ideally, we would just write:
val uniqueDS = doubleDS.distinct()

which should be possible as long as there exists an implicit {{Ordering[T]}} for a {{DataSet[T]}}.
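
To illustrate why an implicit Ordering alone is enough to deduplicate non-tuple elements, here is a minimal sketch on a plain Scala Seq rather than a Flink DataSet. The helper name distinctByOrdering is hypothetical and is not part of Flink; this is not how Flink implements distinct(), only a demonstration of the idea under that assumption.

```scala
// Hypothetical helper (not a Flink API): deduplicate any Seq[T]
// using nothing but an implicit Ordering[T].
def distinctByOrdering[T](xs: Seq[T])(implicit ord: Ordering[T]): Seq[T] = {
  // Sorting places equal elements next to each other.
  val sorted = xs.sorted
  // Keep an element only if it differs from the last one kept.
  sorted.foldLeft(Vector.empty[T]) { (acc, x) =>
    if (acc.nonEmpty && ord.equiv(acc.last, x)) acc else acc :+ x
  }
}

// No Tuple1 wrapping is needed for plain Double elements.
val doubles = Seq(2.0, 1.0, 2.0, 3.0, 1.0)
val unique = distinctByOrdering(doubles)
```

Note that this sketch returns the unique elements in sorted order rather than in their original order, which is usually acceptable for a distributed distinct where no ordering is guaranteed anyway.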

This message was sent by Atlassian JIRA
