Date: Mon, 10 Nov 2014 09:59:33 +0000 (UTC)
From: "maji2014 (JIRA)"
To: issues@cordova.apache.org
Subject: [jira] [Closed] (CB-7998) Exception is thrown when new files such as intermediate results (_COPYING_ files) are found through the textFileStream interface

    [ https://issues.apache.org/jira/browse/CB-7998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

maji2014 closed CB-7998.
------------------------
    Resolution: Invalid

The issue doesn't belong to this project.

> Exception is thrown when new files such as intermediate results (_COPYING_ files) are found through the textFileStream interface
> ---------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CB-7998
>                 URL: https://issues.apache.org/jira/browse/CB-7998
>             Project: Apache Cordova
>          Issue Type: Bug
>            Reporter: maji2014
>
> [Reproduce]
> 1. Run the HdfsWordCount example, which monitors a directory through "ssc.textFileStream(args(0))". A minimal sketch of such a driver is shown after this list.
> 2. Upload a file to HDFS (see [Reason] below for why the upload itself triggers the bug).
> 3. The exception below is thrown.
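For reference, a minimal driver in the style of Spark's stock HdfsWordCount example is sketched below. This is a reconstruction, not the reporter's actual code: the object name and the 2-second batch interval are illustrative, and only ssc.textFileStream(args(0)) is taken from the report (Spark Streaming 1.x API assumed).

    // Sketch of an HdfsWordCount-style driver; illustrative, Spark Streaming 1.x assumed.
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object HdfsWordCount {
      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("HdfsWordCount"), Seconds(2))

        // textFileStream monitors args(0) for newly created files. While an
        // upload such as "hdfs dfs -put" is in flight, the HDFS client writes
        // "<name>._COPYING_" and renames it to "<name>" on completion, so the
        // monitor can list a path that is gone by the time the batch reads it.
        val lines = ssc.textFileStream(args(0))
        lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }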
> [Exception stack]
> 14/11/10 01:21:19 DEBUG Client: IPC Client (842425021) connection to master/192.168.84.142:9000 from ocdc sending #13
> 14/11/10 01:21:19 ERROR JobScheduler: Error generating jobs for time 1415611274000 ms
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:9000/user/spark/200._COPYING_
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:285)
>         at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340)
>         at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>         at org.apache.spark.streaming.dstream.FileInputDStream$$anonfun$org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD$1.apply(FileInputDStream.scala:125)
>         at org.apache.spark.streaming.dstream.FileInputDStream$$anonfun$org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD$1.apply(FileInputDStream.scala:124)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at org.apache.spark.streaming.dstream.FileInputDStream.org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD(FileInputDStream.scala:124)
>         at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:83)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:40)
>         at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:40)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>         at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>         at org.apache.spark.streaming.dstream.TransformedDStream.compute(TransformedDStream.scala:40)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.ShuffledDStream.compute(ShuffledDStream.scala:41)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35)
>         at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291)
>         at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38)
>         at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115)
>         at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115)
>         at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>         at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251)
>         at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>         at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
>         at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251)
>         at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105)
>         at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:115)
>         at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:221)
>         at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:221)
>         at scala.util.Try$.apply(Try.scala:161)
>         at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:221)
>         at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:165)
>         at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$start$1$$anon$1$$anonfun$receive$1.applyOrElse(JobGenerator.scala:76)
>         at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>         at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>         at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>         at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>         at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>         at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>         at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>         at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>         at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Exception in thread "main" 14/11/10 01:21:19 DEBUG Client: IPC Client (842425021) connection to master/192.168.84.142:9000 from ocdc got value #13
> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:9000/user/spark/200._COPYING_
>         [identical stack trace repeated for the exception rethrown in the main thread; omitted]
> 14/11/10 01:21:19 DEBUG ProtobufRpcEngine: Call: getListing took 3ms
>
> [Reason]
> The intermediate upload file 200._COPYING_ is picked up by the FileInputDStream interface, and the exception is thrown when NewHadoopRDD goes to read the now non-existent 200._COPYING_, because the file is renamed to 200 once the upload finishes. A workaround sketch is given after this paragraph.
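One way to sidestep this race, sketched under the assumption of Spark Streaming 1.x's fileStream(directory, filter, newFilesOnly) overload (the typed variant underlying textFileStream), is to filter out the client's in-flight upload artifacts by name. FilteredHdfsWordCount and notCopying are hypothetical names used for illustration:

    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object FilteredHdfsWordCount {
      // Skip the HDFS client's transient upload files: "-put" writes
      // "<name>._COPYING_" and renames it to "<name>" once the copy completes.
      def notCopying(path: Path): Boolean = !path.getName.endsWith("._COPYING_")

      def main(args: Array[String]): Unit = {
        val ssc = new StreamingContext(new SparkConf().setAppName("FilteredHdfsWordCount"), Seconds(2))

        // fileStream exposes the path filter that textFileStream leaves at its default.
        val lines = ssc
          .fileStream[LongWritable, Text, TextInputFormat](args(0), notCopying _, newFilesOnly = true)
          .map { case (_, text) => text.toString } // copy Text early; Hadoop reuses the object

        lines.flatMap(_.split(" ")).map(word => (word, 1)).reduceByKey(_ + _).print()

        ssc.start()
        ssc.awaitTermination()
      }
    }

Filtering by name only narrows the window (a file could still be renamed between listing and reading), so the more robust pattern is to upload into a staging directory on the same filesystem and rename the finished file into the monitored directory, since the rename is atomic.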