From: mateiz
To: reviews@spark.apache.org
Subject: [GitHub] spark pull request: [SPARK-2629][STREAMING] Basic implementation o...
Message-Id: <20151107161738.03EBBE0991@git1-us-west.apache.org>
Date: Sat, 7 Nov 2015 16:17:37 +0000 (UTC)

Github user mateiz commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9256#discussion_r44213890

    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/State.scala ---
    @@ -0,0 +1,199 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.streaming
    +
    +import scala.language.implicitConversions
    +
    +import org.apache.spark.annotation.Experimental
    +
    +/**
    + * :: Experimental ::
    + * Abstract class for getting and updating the tracked state in the `trackStateByKey` operation of
    + * a [[org.apache.spark.streaming.dstream.PairDStreamFunctions pair DStream]] (Scala) or a
    + * [[org.apache.spark.streaming.api.java.JavaPairDStream JavaPairDStream]] (Java).
    + *
    + * Scala example of using `State`:
    + * {{{
    + *    def trackStateFunc(key: String, data: Option[Int], wrappedState: State[Int]): Option[Int] = {
    + *
    + *      // Check if state exists
    + *      if (wrappedState.exists) {
    + *
    + *        val existingState = wrappedState.get  // Get the existing state
    + *
    + *        val shouldRemove = ...                // Decide whether to remove the state
    + *
    + *        if (shouldRemove) {
    + *
    + *          wrappedState.remove()               // Remove the state
    + *
    + *        } else {
    + *
    + *          val newState = ...
    + *          wrappedState.update(newState)       // Set the new state
    + *
    + *        }
    + *      } else {
    + *
    + *        val initialState = ...
    + *        wrappedState.update(initialState)     // Set the initial state
    + *
    + *      }
    + *    }
    + *
    + * }}}
    + *
    + * Java example:
    + * {{{
    + *      TODO(@zsxwing)
    + * }}}
    + */
    +@Experimental
    +sealed abstract class State[S] {
    +
    +  /** Whether the state already exists */
    +  def exists(): Boolean
    +
    +  /**
    +   * Get the state if it exists, otherwise it will throw `java.util.NoSuchElementException`.
    +   * Check with `exists()` whether the state exists or not before calling `get()`.
    +   *
    +   * @throws java.util.NoSuchElementException If the state does not exist.
    +   */
    +  def get(): S
    +
    +  /**
    +   * Update the state with a new value.
    +   *
    +   * State cannot be updated if it has been already removed (that is, `remove()` has already been
    +   * called) or it is going to be removed due to timeout (that is, `isTimingOut()` is `true`).
    +   *
    +   * @throws java.lang.IllegalArgumentException If the state has already been removed, or is
    +   *                                            going to be removed
    +   */
    +  def update(newState: S): Unit
    +
    +  /**
    +   * Remove the state if it exists.
    +   *
    +   * State cannot be updated if it has been already removed (that is, `remove()` has already been
    +   * called) or it is going to be removed due to timeout (that is, `isTimingOut()` is `true`).
    +   */
    +  def remove(): Unit
    +
    +  /**
    +   * Whether the state is timing out and going to be removed by the system after the current batch.
    +   * This timeout can occur if a timeout duration has been specified in the
    +   * [[org.apache.spark.streaming.StateSpec StateSpec]] and the key has not received any new data
    +   * for that timeout duration.
    +   */
    +  def isTimingOut(): Boolean
    +
    +  /**
    +   * Get the state as an [[scala.Option]]. It will be `Some(state)` if it exists, otherwise `None`.
    +   */
    +  @inline final def getOption(): Option[S] = if (exists) Some(get()) else None
    +
    +  @inline final override def toString(): String = {
    +    getOption.map { _.toString }.getOrElse("")
    +  }
    +}
    +
    +private[streaming]
    +object State {
    +  implicit def toOption[S](state: State[S]): Option[S] = state.getOption()
    +}
    --- End diff --

    Why do we need this?