incubator-crunch-dev mailing list archives

From "Matthew Hayes (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CRUNCH-166) NullPointerException when attempting to use Sort.sortPairs
Date Tue, 19 Feb 2013 23:41:14 GMT
Matthew Hayes created CRUNCH-166:
------------------------------------

             Summary: NullPointerException when attempting to use Sort.sortPairs
                 Key: CRUNCH-166
                 URL: https://issues.apache.org/jira/browse/CRUNCH-166
             Project: Crunch
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.5.0
         Environment: Hadoop 1.0.4
            Reporter: Matthew Hayes
            Assignee: Josh Wills


I'm attempting to count some strings and then order by the count descending.  My code effectively
looks like this:

{code}
PCollection<SomeType> records = pipeline.read(...);
PCollection<String> stringsToCount = records.parallelDo(
            new DoFn<SomeType, String>() {
                @Override
                public void process(SomeType input, Emitter<String> emitter) {
                  if (input.getRecords() != null && input.getRecords().size() > 0)
                  {
                    for (MyRecord record : input.getRecords())
                    {
                      emitter.emit(record.getValue().toString());
                    }
                  }                      
                }
            },              
            Writables.strings()
        );
PTable<String, Long> stats = Aggregate.count(stringsToCount);
PCollection<Pair<String, Long>> sortedStats = Sort.sortPairs(stats, new ColumnOrder(2, Order.DESCENDING));
pipeline.writeTextFile(sortedStats, "somewhere");
{code}
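
For reference, the result the pipeline above is meant to produce can be sketched in plain Java, without Crunch or Hadoop. This is only an illustration of the intent (count each string, then order the pairs by count, largest first). The class name, method name, and sample data are hypothetical and do not come from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory illustration; not Crunch code.
class CountThenSortSketch {

    // Counts each distinct string, then orders the (string, count) pairs
    // by count, largest first.
    static List<Map.Entry<String, Long>> countAndSortDescending(List<String> values) {
        Map<String, Long> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(counts.entrySet());
        // Same intent as sorting on the second column of each pair, descending.
        sorted.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        return sorted;
    }

    public static void main(String[] args) {
        // Made-up sample input.
        List<String> sample = Arrays.asList("b", "a", "b", "c", "b", "a");
        for (Map.Entry<String, Long> entry : countAndSortDescending(sample)) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```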

The error I get is:

{code}
java.lang.NullPointerException
	at org.apache.crunch.lib.Sort$TupleWritableComparator.setConf(Sort.java:459)
	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
	at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:773)
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:959)
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:674)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)
{code}

Note that the line numbers are shifted because I added some debugging and recompiled.  The
NullPointerException is thrown in TupleWritableComparator.setConf() here:

{code}
String[] columnOrderNames = ordering.split(",");
{code}

I suspect that "crunch.ordering" is not set in the job configuration, so ordering is null.
When I check the job's conf in the JobTracker, I don't see this property set either.
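
The suspected failure mode can be reproduced in isolation. The sketch below is not the Crunch source: it uses a plain `Map` as a stand-in for the job configuration (on the assumption that looking up an unset key returns null), and the class name, method names, and the sample value `"a,b"` are hypothetical. It shows that calling `split` on a missing value throws a bare NullPointerException, while an explicit null check fails with a message that names the missing property.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone illustration; a Map stands in for the job configuration.
class OrderingConfSketch {

    // Mirrors the failing pattern: the lookup returns null for an unset key,
    // so split(",") throws NullPointerException.
    static String[] parseUnguarded(Map<String, String> conf) {
        String ordering = conf.get("crunch.ordering");
        return ordering.split(",");
    }

    // Same lookup, but fails with a message naming the missing property.
    static String[] parseGuarded(Map<String, String> conf) {
        String ordering = conf.get("crunch.ordering");
        if (ordering == null) {
            throw new IllegalStateException(
                "crunch.ordering is not set in the job configuration");
        }
        return ordering.split(",");
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        try {
            parseUnguarded(conf);
        } catch (NullPointerException e) {
            System.out.println("unguarded: NullPointerException");
        }

        try {
            parseGuarded(conf);
        } catch (IllegalStateException e) {
            System.out.println("guarded: " + e.getMessage());
        }

        // Placeholder value; the real format of the property is not shown in this report.
        conf.put("crunch.ordering", "a,b");
        System.out.println("parsed columns: " + parseGuarded(conf).length);
    }
}
```

A check of this kind would not fix the underlying problem of the property never being set, but it would make the cause visible in the task logs.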

Am I doing something wrong?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
