X-Original-To: apmail-incubator-crunch-dev-archive@minotaur.apache.org
Delivered-To: apmail-incubator-crunch-dev-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 67382E3A1
    for ; Thu, 21 Feb 2013 22:38:14 +0000 (UTC)
Received: (qmail 30693 invoked by uid 500); 21 Feb 2013 22:38:13 -0000
Delivered-To: apmail-incubator-crunch-dev-archive@incubator.apache.org
Received: (qmail 30629 invoked by uid 500); 21 Feb 2013 22:38:13 -0000
Mailing-List: contact crunch-dev-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: crunch-dev@incubator.apache.org
Delivered-To: mailing list crunch-dev@incubator.apache.org
Received: (qmail 30411 invoked by uid 99); 21 Feb 2013 22:38:13 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 21 Feb 2013 22:38:13 +0000
Date: Thu, 21 Feb 2013 22:38:13 +0000 (UTC)
From: "Matthew Hayes (JIRA)"
To: crunch-dev@incubator.apache.org
Subject: [jira] [Commented] (CRUNCH-166) NullPointerException when attempting to use Sort.sortPairs
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/CRUNCH-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583622#comment-13583622 ]

Matthew Hayes commented on CRUNCH-166:
--------------------------------------

Looking in the job tracker, here are the properties in the failed job's configuration whose names start with "crunch.":
crunch.work.dir
crunch.reflectdatafactory
crunch.outputs.dir
crunch.debug

> NullPointerException when attempting to use Sort.sortPairs
> ----------------------------------------------------------
>
>                 Key: CRUNCH-166
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-166
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5.0
>         Environment: Hadoop 1.0.4
>            Reporter: Matthew Hayes
>            Assignee: Josh Wills
>         Attachments: WordSortIT.java
>
>
> I'm attempting to count some strings and then order by the count descending. My code effectively looks like this:
> {code}
> PCollection<SomeType> records = pipeline.read(...);
> PCollection<String> stringsToCount = records.parallelDo(
>   new DoFn<SomeType, String>() {
>     @Override
>     public void process(SomeType input, Emitter<String> emitter) {
>       if (input.getRecords() != null && input.getRecords().size() > 0)
>       {
>         for (MyRecord record : input.getRecords())
>         {
>           emitter.emit(record.getValue().toString());
>         }
>       }
>     }
>   },
>   Writables.strings()
> );
> PTable<String, Long> stats = Aggregate.count(stringsToCount);
> PCollection<Pair<String, Long>> sortedStats = Sort.sortPairs(stats, new ColumnOrder(2, Order.DESCENDING));
> pipeline.writeTextFile(sortedStats,"somewhere");
> {code}
> The error I get is:
> {code}
> java.lang.NullPointerException
> 	at org.apache.crunch.lib.Sort$TupleWritableComparator.setConf(Sort.java:459)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> 	at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:773)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
> 	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> Note that the line numbers are shifted because I added some debugging and recompiled. The NullPointerException is thrown in TupleWritableComparator.setConf() here:
> {code}
> String[] columnOrderNames = ordering.split(",");
> {code}
> I suppose "crunch.ordering" is not set, and therefore ordering is null. When I check the conf in the job tracker, I also don't see this property set.
> Am I doing something wrong?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
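
For illustration only, the failure mode described in the report (a missing "crunch.ordering" property leading to a null `ordering` and a NullPointerException on `split`) can be sketched outside of Hadoop. This is not Crunch's code: the class `OrderingConfCheck` and the method `parseOrdering` are hypothetical, a plain `Map` stands in for the job configuration, and the `crunch.debug` value is a placeholder. It only shows how an explicit check would turn the bare NullPointerException into a message naming the missing property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Crunch: a plain Map stands in for the
// Hadoop job configuration so the sketch runs without any dependencies.
final class OrderingConfCheck {
    // Property name taken from the report above.
    static final String ORDERING_KEY = "crunch.ordering";

    // Returns the comma-separated entries of the ordering property, or throws
    // with a descriptive message when the property is absent. Calling
    // split(",") on the null value directly would raise the bare NPE instead.
    static String[] parseOrdering(Map<String, String> conf) {
        String ordering = conf.get(ORDERING_KEY);
        if (ordering == null) {
            throw new IllegalStateException(
                "Job configuration is missing '" + ORDERING_KEY + "'");
        }
        return ordering.split(",");
    }

    public static void main(String[] args) {
        // Mimics the failed job: other crunch.* properties are present
        // (placeholder value), but "crunch.ordering" is deliberately absent.
        Map<String, String> conf = new HashMap<>();
        conf.put("crunch.debug", "true");
        try {
            parseOrdering(conf);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

A check like this does not explain why the property never reached the job configuration; it only makes the symptom easier to read in the task logs.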