From: Kristoffer Sjögren
To: user@crunch.apache.org
Date: Wed, 11 Jun 2014 16:23:07 +0200
Subject: CDH5

Hi,

I'm trying out Crunch on YARN on CDH5 (0.9.0-cdh5.0.0) and get errors when trying to materialize results (see the stack trace below). The job itself is very simple:

    PCollection<String> lines = pipeline.read(new TextFileSource<String>(
        new Path("hdfs://....log"), Writables.strings()));
    lines = lines.parallelDo(new DoFn<String, String>() {
      @Override
      public void process(String s, Emitter<String> e) {
        e.emit(s);
      }
    }, Writables.strings());
    for (String line : lines.materialize()) {
      System.out.println(line);
    }

It looks like some kind of sync issue, because I can see what appears to be the "correct" tmp dir in HDFS. Note that the p index is "p2" in HDFS, while the client looks for "p1".

    -rw-r--r--   1 kristoffersjogren supergroup     1748 2014-06-11 15:36 /tmp/crunch-134908575/p2/MAP
    drwxr-xr-x   - kristoffersjogren supergroup        0 2014-06-11 15:36 /tmp/crunch-134908575/p2/output
    -rw-r--r--   1 kristoffersjogren supergroup        0 2014-06-11 15:36 /tmp/crunch-134908575/p2/output/_SUCCESS
    -rw-r--r--   1 kristoffersjogren supergroup 42898831 2014-06-11 15:36 /tmp/crunch-134908575/p2/output/out0-m-00000
    -rw-r--r--   1 kristoffersjogren supergroup        0 2014-06-11 15:36 /tmp/crunch-134908575/p2/output/part-m-00000

If I instead write directly to HDFS using the following, the job finishes successfully, but nothing is written:

    pipeline.write(lines, new TextFileSourceTarget<String>("/user/stoffe",
        Writables.strings()), WriteMode.OVERWRITE);

Any ideas about what might be going wrong?

Cheers,
-Kristoffer

Exception in thread "main" java.lang.RuntimeException: org.apache.crunch.CrunchRuntimeException: java.io.IOException: No files found to materialize at: /tmp/crunch-1611606737/p1
    at mapred.CrunchJob.<init>(CrunchJob.java:36)
    at mapred.tempjobs.DownloadFiles.<init>(DownloadFiles.java:16)
    at mapred.tempjobs.DownloadFiles.main(DownloadFiles.java:20)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)
Caused by: org.apache.crunch.CrunchRuntimeException: java.io.IOException: No files found to materialize at: /tmp/crunch-1611606737/p1
    at org.apache.crunch.materialize.MaterializableIterable.materialize(MaterializableIterable.java:79)
    at org.apache.crunch.materialize.MaterializableIterable.iterator(MaterializableIterable.java:69)
    at mapred.tempjobs.DownloadFiles.run(DownloadFiles.java:37)
    at mapred.CrunchJob.run(CrunchJob.java:96)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at mapred.CrunchJob.<init>(CrunchJob.java:34)
    ... 7 more
Caused by: java.io.IOException: No files found to materialize at: /tmp/crunch-1611606737/p1
    at org.apache.crunch.io.CompositePathIterable.create(CompositePathIterable.java:49)
    at org.apache.crunch.io.impl.FileSourceImpl.read(FileSourceImpl.java:136)
    at org.apache.crunch.io.seq.SeqFileSource.read(SeqFileSource.java:43)
    at org.apache.crunch.io.impl.ReadableSourcePathTargetImpl.read(ReadableSourcePathTargetImpl.java:37)
    at org.apache.crunch.materialize.MaterializableIterable.materialize(MaterializableIterable.java:76)
    ... 12 more