incubator-crunch-dev mailing list archives

From "Matthew Hayes (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CRUNCH-166) NullPointerException when attempting to use Sort.sortPairs
Date Fri, 22 Feb 2013 00:46:13 GMT

    [ https://issues.apache.org/jira/browse/CRUNCH-166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583754#comment-13583754 ]

Matthew Hayes commented on CRUNCH-166:
--------------------------------------

Hmm perhaps, let me dig through this some more.
                
> NullPointerException when attempting to use Sort.sortPairs
> ----------------------------------------------------------
>
>                 Key: CRUNCH-166
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-166
>             Project: Crunch
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: 0.5.0
>         Environment: Hadoop 1.0.4
>            Reporter: Matthew Hayes
>            Assignee: Josh Wills
>         Attachments: WordSortIT.java
>
>
> I'm attempting to count some strings and then order by the count descending.  My code effectively looks like this:
> {code}
> PCollection<SomeType> records = pipeline.read(...);
> PCollection<String> stringsToCount = records.parallelDo(
>             new DoFn<SomeType, String>() {
>                 @Override
>                 public void process(SomeType input,Emitter<String> emitter) {
>                   if (input.getRecords() != null && input.getRecords().size() > 0)
>                   {
>                     for (MyRecord record : input.getRecords())
>                     {
>                       emitter.emit(record.getValue().toString());
>                     }
>                   }                      
>                 }
>             },              
>             Writables.strings()
>         );
> PTable<String, Long> stats = Aggregate.count(stringsToCount);
> PCollection<Pair<String, Long>> sortedStats = Sort.sortPairs(stats, new ColumnOrder(2, Order.DESCENDING));
> pipeline.writeTextFile(sortedStats,"somewhere");
> {code}
> The error I get is:
> {code}
> java.lang.NullPointerException
> 	at org.apache.crunch.lib.Sort$TupleWritableComparator.setConf(Sort.java:459)
> 	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
> 	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
> 	at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:773)
> 	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
> 	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
> 	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> 	at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {code}
> Note that the line numbers are shifted because I added some debugging and recompiled.  The NullPointerException is thrown in TupleWritableComparator.setConf() here:
> {code}
> String[] columnOrderNames = ordering.split(",");
> {code}
> I suppose "crunch.ordering" is not set, and therefore ordering is null.  When I check the job configuration in the JobTracker, I don't see this property set either.
> Am I doing something wrong?
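
The failure mode described above (calling split() on a configuration value that was never set) can be reproduced without Hadoop or Crunch. The following is only a minimal sketch under stated assumptions: a plain Map stands in for the job configuration, the class and method names are hypothetical, and the value "col1,col2" is a made-up placeholder, not the real format of "crunch.ordering". It is not the Crunch source and not a proposed fix for CRUNCH-166; it only contrasts an unguarded lookup with one that fails with a descriptive message.

```java
import java.util.HashMap;
import java.util.Map;

public class OrderingConfSketch {
    // Property name taken from the report above.
    static final String ORDERING_KEY = "crunch.ordering";

    // Same pattern as the reported line: split() on a value that may be null.
    // Throws NullPointerException when the property is absent.
    static String[] unguardedColumns(Map<String, String> conf) {
        String ordering = conf.get(ORDERING_KEY);
        return ordering.split(",");
    }

    // Hypothetical guard: fail early and name the missing property.
    static String[] guardedColumns(Map<String, String> conf) {
        String ordering = conf.get(ORDERING_KEY);
        if (ordering == null) {
            throw new IllegalStateException(
                "Property '" + ORDERING_KEY + "' is not set in the job configuration");
        }
        return ordering.split(",");
    }

    public static void main(String[] args) {
        // Stand-in for a job configuration in which the property was never set.
        Map<String, String> conf = new HashMap<>();

        try {
            unguardedColumns(conf);
        } catch (NullPointerException e) {
            System.out.println("unguarded lookup threw NullPointerException");
        }

        try {
            guardedColumns(conf);
        } catch (IllegalStateException e) {
            System.out.println("guarded lookup threw: " + e.getMessage());
        }

        // Placeholder value only; the real property format is not shown in the report.
        conf.put(ORDERING_KEY, "col1,col2");
        System.out.println("columns parsed: " + guardedColumns(conf).length);
    }
}
```

A guard like this would not make the sort work, but it would turn the bare NullPointerException into an error that points directly at the missing property.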

