From: Samik Raychaudhuri <samikr@gmail.com>
Date: Fri, 12 Sep 2014 16:31:36 +0530
To: user@crunch.apache.org
Subject: Re: NPE when trying to run job

Hi Gabriel,

Thanks for the response.

I am running JDK 8u20 x64 (the most recent release) on Windows 7 x64, and I am running the program directly from IntelliJ (the IDE I use for this work). Another point worth mentioning is that HDFS is accessed through a double SSH tunnel, but that shouldn't cause a problem, since I can run other Hadoop commands from a Cygwin command prompt just fine.

Here is the configuration part of the Java program:

        // Create a configuration object to load specific configurations.
        Configuration conf = new Configuration();
        // Route Hadoop RPC through the local SOCKS proxy (the SSH tunnel).
        conf.set("hadoop.socks.server", "localhost:8020");
        conf.set("hadoop.rpc.socket.factory.class.default", "org.apache.hadoop.net.SocksSocketFactory");
        // Point the default filesystem at the remote namenode.
        conf.set("fs.default.name", "hdfs://namenode01.xxx.net:8020");
        // Use a local folder as the temporary folder.
        conf.set("hadoop.tmp.dir", "/tmp");

Let me know if anything looks suspicious.
Also, thanks for clarifying that the first snippet returns the total size (in bytes?), rather than the event count I thought I was getting.

Regards.
-Samik

On 12/09/2014 4:20 PM, Gabriel Reid wrote:
Hi Samik,

Thanks for your thorough description. Which operating system and JDK
are you running on the client?

By my read of the Hadoop & JDK code (although the line numbers don't
match up exactly), the issue is occurring when a shell command is
about to be run to set the permissions on a newly-created local
staging directory. It looks like one of the parameters to the shell
command (probably the path) is null, although I don't yet see how that
could happen.
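
For reference, the ProcessBuilder javadoc documents exactly this behaviour: start() throws a NullPointerException if any element of the command list is null. A tiny standalone illustration (nothing Hadoop-specific, just to show where that top stack frame comes from):

    import java.util.Arrays;

    public class NullCommandElement
    {
        public static void main(String[] args) throws Exception
        {
            // The null stands in for a command argument (e.g. a path) that was never resolved.
            ProcessBuilder pb = new ProcessBuilder(Arrays.asList("chmod", "700", null));
            pb.start(); // throws java.lang.NullPointerException before anything runs
        }
    }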

The reason your first job is running properly (where you just call
getSize() on the returned PCollection) is that it's not actually doing
anything via MapReduce -- instead, it's just calculating the total
file size of all input files (PCollection#getSize() returns the byte
size, not the count of elements in the PCollection).
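
If what you want is the number of records rather than the byte size, something along these lines should do it -- a sketch, assuming Crunch's PCollection#length(), which returns a PObject<Long> and only does the counting work when its value is requested:

        PCollection<Event> events =
            pipeline.read(From.avroFile("/raw/*.avro", Avros.specifics(Event.class)));
        // length() counts elements; getValue() forces the pipeline to run.
        PObject<Long> count = events.length();
        System.out.println("Event count: " + count.getValue());
        pipeline.done();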

Are you able to run a "standard" mapreduce job or pig job or something
similar from that client?

- Gabriel


On Fri, Sep 12, 2014 at 11:27 AM, Samik Raychaudhuri <samikr@gmail.com> wrote:
Hello,
I am able to run a count of a PCollection from a bunch of avro files just
fine, but when I try to execute an MR job on the PCollection, I am getting
an NPE.

The following runs fine:

        PCollection<Event> events =
pipeline.read(From.avroFile("/raw/*.avro", Avros.specifics(Event.class)));
        PipelineResult result = pipeline.done();
        System.out.println("Event count: " + events.getSize());

And I get the event count.
But the following doesn't (methods from a number of POJOs generated from the Avro schema
are used here):

        PCollection<Event> events =
pipeline.read(From.avroFile("/raw/*.avro", Avros.specifics(Event.class)));
        // Now create a PTable based on client and event type. Also have a
long for counting purpose.
        PTable<Pair<String, String>, Long> eventsByClient =
events.parallelDo(
            new MapFn<Event, Pair<Pair<String, String>, Long>>()
            {
                @Override
                public Pair<Pair<String, String>, Long> map(Event event)
                {
                    String eventType =
event.getBody().getTypeSpecificBody().getBody().getClass().getName();
                    eventType =
eventType.substring(eventType.lastIndexOf('.') + 1);
                    return Pair.of(Pair.of(event.getHeader().getClientId(),
eventType), 1L);
                }
            }, Avros.tableOf(Avros.pairs(Avros.strings(), Avros.strings()),
Avros.longs())
        );
        PTable<Pair<String, String>, Long> eventCountsByClient =
eventsByClient.groupByKey().combineValues(Aggregators.SUM_LONGS());
        pipeline.writeTextFile(eventCountsByClient, "/user/samikr/output");
        PipelineResult result = pipeline.done();

I am getting the following exception:

1 job failure(s) occurred:
Collect Data Info: Avro(/raw/... ID=1 (1/1)(1):
java.lang.NullPointerException
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1012)
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:482)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:702)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:791)
    at org.apache.hadoop.util.Shell.execCommand(Shell.java:774)
    at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646)
    at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:434)
    at org.apache.hadoop.fs.FilterFileSystem.mkdirs(FilterFileSystem.java:281)
    at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:125)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:348)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1282)
    at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchControlledJob.submit(CrunchControlledJob.java:329)
    at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.startReadyJobs(CrunchJobControl.java:204)
    at org.apache.crunch.hadoop.mapreduce.lib.jobcontrol.CrunchJobControl.pollJobStatusAndStartNewOnes(CrunchJobControl.java:238)
    at org.apache.crunch.impl.mr.exec.MRExecutor.monitorLoop(MRExecutor.java:112)
    at org.apache.crunch.impl.mr.exec.MRExecutor.access$000(MRExecutor.java:55)
    at org.apache.crunch.impl.mr.exec.MRExecutor$1.run(MRExecutor.java:83)
    at java.lang.Thread.run(Thread.java:745)

Not sure what is causing the NPE though. From the stack trace, it looks like
it is some permission issue. I have checked "hadoop.tmp.dir" and it seems
to have write permission etc., and I have also noticed that a folder named
"samik.r1802905367" gets created for the job within that directory. I have
tried giving one specific Avro file in pipeline.read rather than *.avro, but
that results in the same exception. I am using Hadoop 2.5.0, Avro 1.7.7 and
Crunch 0.10.0-hadoop2 on the client side and CDH5 (2.3.0) on the server
side.
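
To narrow it down further, I was also thinking of reproducing just the staging-directory step from the stack trace in isolation, along these lines (only a sketch; the path is an arbitrary example, and it needs the FileSystem, Path and FsPermission imports):

        // Sketch: exercise the local mkdirs + setPermission calls seen in the stack trace.
        Configuration conf = new Configuration();
        conf.set("hadoop.tmp.dir", "/tmp");
        FileSystem localFs = FileSystem.getLocal(conf);
        Path stagingTest = new Path("/tmp/staging-test");   // arbitrary example path
        localFs.mkdirs(stagingTest);
        localFs.setPermission(stagingTest, new FsPermission((short) 0700));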

Any pointers?
Regards.
