beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <j...@apache.org>
Subject [jira] [Created] (BEAM-2267) Final files for WordCount not appearing with Apex on YARN
Date Thu, 11 May 2017 22:40:04 GMT
Kenneth Knowles created BEAM-2267:
-------------------------------------

             Summary: Final files for WordCount not appearing with Apex on YARN
                 Key: BEAM-2267
                 URL: https://issues.apache.org/jira/browse/BEAM-2267
             Project: Beam
          Issue Type: Bug
          Components: runner-apex
            Reporter: Kenneth Knowles
            Assignee: Thomas Weise


When I run WordCount with the Apex runner on a YARN cluster (specifically Dataproc, reading from
and writing to GCS), the word counts are all written to temporary files, but those files are never
moved to their final destination.

Hadoop version 2.7.3
Beam RC 2.0.0

Steps to repro:

1. Instantiate archetype (see below)
2. Build uber jar {{mvn --settings ../beamrc-settings.xml clean package -P apex-runner}}
3. SCP to master (or wherever you'd like to launch from)
4. {{java -cp word-count-beam-0.1.jar beamrc.WordCount --runner=ApexRunner --embeddedExecution=false --inputFile=gs://apache-beam-samples/shakespeare/winterstale-personae --output=SOMEWHERE}}
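
One way to confirm the symptom after step 4 is to list the output location and count the files that match Beam's default shard naming ({{-SSSSS-of-NNNNN}}). The following is a minimal sketch and not part of the original repro: the helper name {{count_final_shards}} and the {{OUTPUT}} variable are hypothetical, and the usage line assumes {{gsutil}} is installed where you run it.

```shell
#!/bin/sh
# Hypothetical helper: reads a file listing on stdin (one path per line)
# and prints how many paths end in Beam's default shard suffix,
# e.g. "SOMEWHERE-00000-of-00003".
count_final_shards() {
  # grep -c prints the match count; it exits non-zero when the count is 0,
  # so "|| true" keeps the helper from failing in that case.
  grep -E -c -- '-[0-9]{5}-of-[0-9]{5}$' || true
}

# Usage (assumption: OUTPUT holds the value passed to --output):
#   gsutil ls "${OUTPUT}*" | count_final_shards
```

A count of 0 after the job finishes would match the behavior described above, where only temporary files are present.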

Appendix: steps to instantiate RC archetype:

Build an RC-specific {{beamrc-settings.xml}}:
{code}
<settings>
  <profiles>
    <profile>
      <id>beam-2.0.0</id>
      <repositories>
        <repository>
          <!-- This id _must_ be "archetype" -->
          <id>archetype</id>
          <url>RC_REPO</url>
        </repository>
      </repositories>
    </profile>
  </profiles>
 
  <activeProfiles>
    <activeProfile>beam-2.0.0</activeProfile>
  </activeProfiles>
</settings>
{code}

Then instantiate the archetype like so:
{code}
mvn archetype:generate \
      --settings beamrc-settings.xml \
      -D archetypeCatalog=internal \
      -D archetypeGroupId=org.apache.beam \
      -D archetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
      -D archetypeVersion=2.0.0 \
      -D groupId=beamrc \
      -D artifactId=word-count-beam \
      -D version="0.1" \
      -D package=beamrc \
      -D interactiveMode=false
{code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
