beam-commits mailing list archives

From "Jesse Anderson (JIRA)" <>
Subject [jira] [Created] (BEAM-645) Running Wordcount in Spark Checks Locally and Outputs in HDFS
Date Tue, 20 Sep 2016 18:25:20 GMT
Jesse Anderson created BEAM-645:

             Summary: Running Wordcount in Spark Checks Locally and Outputs in HDFS
                 Key: BEAM-645
             Project: Beam
          Issue Type: Bug
          Components: runner-spark
    Affects Versions: 0.3.0-incubating
            Reporter: Jesse Anderson
            Assignee: Amit Sela

When running the WordCount example with the Spark runner, the runner reads the input
file from HDFS. However, when the program performs its startup checks, it looks for the
file in the local filesystem.

To work around this issue, you have to create a file with the same name in the local
filesystem and put the actual file in HDFS.
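
The local half of this workaround can be sketched in Java as follows. This is only an illustrative sketch, not part of Beam: the class name LocalPlaceholder and the method ensurePlaceholder are hypothetical, and it assumes the startup check resolves the input name against the working directory of the process that launches the pipeline. Putting the actual file in HDFS is a separate step (for example with hdfs dfs -put).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: creates an empty local file so that a local-filesystem
// existence check for the input name can succeed. It does not touch HDFS.
public class LocalPlaceholder {

    /**
     * Creates an empty file at localPath if nothing exists there yet.
     * Returns true if the file was created, false if it was already present.
     */
    public static boolean ensurePlaceholder(Path localPath) throws IOException {
        if (Files.exists(localPath)) {
            return false;
        }
        Files.createFile(localPath);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the name matches the input passed to the pipeline,
        // e.g. Macbeth.txt as in the stack trace of this report.
        Path placeholder = Paths.get(args.length > 0 ? args[0] : "Macbeth.txt");
        boolean created = ensurePlaceholder(placeholder);
        System.out.println((created ? "Created " : "Already present: ")
                + placeholder.toAbsolutePath());
    }
}
```

Run it from the directory where the job is submitted, before submitting the job. The placeholder can stay empty because, per the description above, the runner reads the real data from HDFS.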

Here is the stack trace when the file doesn't exist in the local filesystem:
{quote}Exception in thread "main" java.lang.IllegalStateException: Unable to find any files matching Macbeth.txt
	at org.apache.beam.sdk.runners.PipelineRunner.apply(
	at org.apache.beam.runners.spark.SparkRunner.apply(
	at org.apache.beam.sdk.Pipeline.applyInternal(
	at org.apache.beam.sdk.Pipeline.applyTransform(
	at org.apache.beam.sdk.values.PBegin.apply(
	at org.apache.beam.sdk.Pipeline.apply(
	at org.apache.beam.examples.WordCount.main(
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(
	at java.lang.reflect.Method.invoke(
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala){quote}
