From: wangxiaojing
To: reviews@spark.apache.org
Reply-To: reviews@spark.apache.org
Subject: [GitHub] spark pull request: [spark-3586][streaming]Support nested director...
Message-Id: <20141011100535.4C29881EFA8@tyr.zones.apache.org>
Date: Sat, 11 Oct 2014 10:05:35 +0000 (UTC)

Github user wangxiaojing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2765#discussion_r18740834

    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/dstream/FileInputDStream.scala ---
    @@ -118,6 +119,18 @@ class FileInputDStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K,V] : Clas
           (newFiles, filter.minNewFileModTime)
         }

    +  def getPathList( path:Path, fs:FileSystem):List[Path]={
    +    val filter = new SubPathFilter()
    +    var pathList = List[Path]()
    +    fs.listStatus(path,filter).map(x=>{
    +      if(x.isDirectory()){
    --- End diff --

    Yes, because this only supports subdirectories. If it descended into every nested directory, the processing time would be too long.
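
The trade-off described in the comment above can be sketched outside Spark. The following is a minimal, hypothetical Scala illustration that uses java.io.File in place of Hadoop's FileSystem and Path. The names OneLevelListing, listOneLevel, and listRecursive are assumptions made for this example and are not taken from the pull request, whose quoted diff is truncated.

```scala
import java.io.File

// Hypothetical sketch: java.io.File stands in for Hadoop's FileSystem/Path.
object OneLevelListing {

  /** Immediate children of `dir`, sorted by name for stable output.
    * listFiles() returns null for non-directories or I/O errors, hence Option. */
  private def children(dir: File): List[File] =
    Option(dir.listFiles()).map(_.toList).getOrElse(Nil).sortBy(_.getName)

  /** Entries directly in `dir`, plus regular files in its direct subdirectories.
    * Deeper levels are never visited, so the cost is bounded by two listing levels. */
  def listOneLevel(dir: File): List[File] =
    children(dir).flatMap { f =>
      if (f.isDirectory) children(f).filter(_.isFile) else List(f)
    }

  /** Non-directory entries at any depth. Every directory in the tree is listed,
    * so the cost grows with the size of the whole tree. */
  def listRecursive(dir: File): List[File] =
    children(dir).flatMap { f =>
      if (f.isDirectory) listRecursive(f) else List(f)
    }
}
```

With a layout such as root/a.txt, root/sub/b.txt, and root/sub/deep/c.txt, listOneLevel would pick up a.txt and b.txt only, while listRecursive would also reach c.txt at the price of walking every directory on each scan.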