From: tdas@apache.org
To: commits@spark.apache.org
Message-Id: <68e2a4a4097e4b70ad9c1e14a3a6784b@git.apache.org>
Subject: spark git commit: [SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD
Date: Wed, 11 Nov 2015 21:29:34 +0000 (UTC)

Repository: spark
Updated Branches:
  refs/heads/master a9a6b80c7 -> dd77e278b

[SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD

tdas koeninger

This updates the Spark Streaming + Kafka Integration Guide doc with a working method to access the offsets of a `KafkaRDD` through Python.

Author: Nick Evans

Closes #9289 from manygrams/update_kafka_direct_python_docs.

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/dd77e278
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/dd77e278
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/dd77e278

Branch: refs/heads/master
Commit: dd77e278b99e45c20fdefb1c795f3c5148d577db
Parents: a9a6b80
Author: Nick Evans
Authored: Wed Nov 11 13:29:30 2015 -0800
Committer: Tathagata Das
Committed: Wed Nov 11 13:29:30 2015 -0800

----------------------------------------------------------------------
 docs/streaming-kafka-integration.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/dd77e278/docs/streaming-kafka-integration.md
----------------------------------------------------------------------
diff --git a/docs/streaming-kafka-integration.md b/docs/streaming-kafka-integration.md
index ab7f011..b00351b 100644
--- a/docs/streaming-kafka-integration.md
+++ b/docs/streaming-kafka-integration.md
@@ -181,7 +181,20 @@ Next, we discuss how to use this approach in your streaming application.
  );
- Not supported yet
+ offsetRanges = []
+
+ def storeOffsetRanges(rdd):
+     global offsetRanges
+     offsetRanges = rdd.offsetRanges()
+     return rdd
+
+ def printOffsetRanges(rdd):
+     for o in offsetRanges:
+         print "%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset)
+
+ directKafkaStream\
+     .transform(storeOffsetRanges)\
+     .foreachRDD(printOffsetRanges)
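
The pattern added in the diff (capture the offset ranges in `transform`, then read them back in `foreachRDD`) can be tried without a Kafka cluster. The sketch below is only an illustration under stated assumptions: `OffsetRange` and `FakeKafkaRDD` are hypothetical stand-ins written for this example, not the real PySpark classes, and `run_batch` is a made-up helper that mimics one batch passing through `transform` and then `foreachRDD`. It uses Python 3 `print()`, whereas the committed doc snippet uses the Python 2 print statement.

```python
from collections import namedtuple

# Hypothetical stand-in for an offset range; only the four attributes
# read in the diff (topic, partition, fromOffset, untilOffset) are modelled.
OffsetRange = namedtuple("OffsetRange", "topic partition fromOffset untilOffset")


class FakeKafkaRDD:
    """Hypothetical stand-in for a KafkaRDD; exposes offsetRanges() only."""

    def __init__(self, ranges):
        self._ranges = list(ranges)

    def offsetRanges(self):
        return self._ranges


# Module-level holder, as in the diff: written by the transform step
# and read by the output step of the same batch.
offsetRanges = []


def storeOffsetRanges(rdd):
    """Record the batch's offset ranges, then pass the RDD through unchanged."""
    global offsetRanges
    offsetRanges = rdd.offsetRanges()
    return rdd


def formatOffsetRanges(rdd):
    """Build one 'topic partition fromOffset untilOffset' line per range."""
    return ["%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset)
            for o in offsetRanges]


def run_batch(rdd):
    """Hypothetical driver: store the offsets first, then print them."""
    lines = formatOffsetRanges(storeOffsetRanges(rdd))
    for line in lines:
        print(line)
    return lines


if __name__ == "__main__":
    run_batch(FakeKafkaRDD([OffsetRange("events", 0, 10, 25),
                            OffsetRange("events", 1, 7, 19)]))
```

In a real job the two functions would be chained on the direct stream exactly as in the diff. The shared global presumably works there because both callbacks are invoked on the driver for each batch, and `storeOffsetRanges` is placed first so that it sees the RDD before any other transformation is applied.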