spark-issues mailing list archives

From "Burak Yavuz (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-14287) Method to determine if Dataset is bounded or not
Date Thu, 31 Mar 2016 04:05:25 GMT

     [ https://issues.apache.org/jira/browse/SPARK-14287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Burak Yavuz updated SPARK-14287:
--------------------------------
    Summary: Method to determine if Dataset is bounded or not  (was: isStreaming method for Dataset)

> Method to determine if Dataset is bounded or not
> ------------------------------------------------
>
>                 Key: SPARK-14287
>                 URL: https://issues.apache.org/jira/browse/SPARK-14287
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL, Streaming
>            Reporter: Burak Yavuz
>
> With the addition of StreamExecution (ContinuousQuery) to Datasets, data will become
> unbounded. With unbounded data, the execution of some methods and operations will not make
> sense, e.g. Dataset.count().
> A simple API is required to check whether the data in a Dataset is bounded or unbounded.
> This will allow users to check whether their Dataset is in streaming mode or not. For example,
> ML algorithms may check whether the data is unbounded and throw an exception.
> The implementation of this method is simple; naming it, however, is the challenge. Some
> possible names for this method are:
>  - isStreaming
>  - isContinuous
>  - isBounded
>  - isUnbounded
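
The check described above can be sketched with a toy model. This is only an illustration under stated assumptions, not Spark's implementation: `ToyDataset`, `UnboundedDataError`, and `fit_mean` are hypothetical names, and the flag is called `is_streaming` simply because that is one of the candidate names listed in the issue.

```python
from dataclasses import dataclass
from typing import Iterable


class UnboundedDataError(Exception):
    """Raised when an operation needs bounded data but receives unbounded data."""


@dataclass(frozen=True)
class ToyDataset:
    """Hypothetical stand-in for a Dataset; not Spark's actual class."""
    rows: Iterable[int]
    is_streaming: bool = False  # assumed name, taken from the candidate list

    def count(self) -> int:
        # Counting is only well defined when the data is bounded.
        if self.is_streaming:
            raise UnboundedDataError("count() is undefined on unbounded data")
        return sum(1 for _ in self.rows)


def fit_mean(ds: ToyDataset) -> float:
    """Toy 'ML algorithm' that rejects unbounded input before doing any work."""
    if ds.is_streaming:
        raise UnboundedDataError("fit_mean requires a bounded Dataset")
    values = list(ds.rows)
    return sum(values) / len(values)
```

A caller would test the flag once, up front, instead of discovering the problem when an operation never terminates. The sketch does not handle an empty bounded dataset, which would need its own check in `fit_mean`.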




