incubator-crunch-dev mailing list archives

From "Matthew Hayes (JIRA)" <>
Subject [jira] [Created] (CRUNCH-166) NullPointerException when attempting to use Sort.sortPairs
Date Tue, 19 Feb 2013 23:41:14 GMT
Matthew Hayes created CRUNCH-166:

             Summary: NullPointerException when attempting to use Sort.sortPairs
                 Key: CRUNCH-166
             Project: Crunch
          Issue Type: Bug
          Components: Core
    Affects Versions: 0.5.0
         Environment: Hadoop 1.0.4
            Reporter: Matthew Hayes
            Assignee: Josh Wills

I'm attempting to count some strings and then order by the count descending.  My code effectively
looks like this:

PCollection<SomeType> records =;
PCollection<String> stringsToCount = records.parallelDo(
            new DoFn<SomeType, String>() {
                public void process(SomeType input,Emitter<String> emitter) {
                  if (input.getRecords() != null && input.getRecords().size() >
                    for (MyRecord record : input.getRecords())
PTable<String, Long> stats = Aggregate.count(stringsToCount);
PCollection<Pair<String, Long>> sortedStats = Sort.sortPairs(stats, new ColumnOrder(2,

The error I get is:

	at org.apache.crunch.lib.Sort$TupleWritableComparator.setConf(
	at org.apache.hadoop.util.ReflectionUtils.setConf(
	at org.apache.hadoop.util.ReflectionUtils.newInstance(
	at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(
	at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(
	at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(
	at org.apache.hadoop.mapred.MapTask.runNewMapper(
	at org.apache.hadoop.mapred.Child$
	at Method)
	at org.apache.hadoop.mapred.Child.main(

Note that the line numbers are shifted because I added some debugging and recompiled.  The
NullPointerException is thrown in TupleWritableComparator.setConf() here:

String[] columnOrderNames = ordering.split(",");

I suspect "crunch.ordering" is never set, so ordering is null when split() is called on it.
When I check the job configuration in the JobTracker, I don't see this property set either.
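
To illustrate the failure mode described above, here is a minimal, self-contained sketch of a null guard around the split call. It is not the Crunch source: the class OrderingGuard, the method parseOrdering, and the use of a plain Map as a stand-in for the Hadoop Configuration are all assumptions made for illustration, and the "a,b" value is a placeholder rather than the real format of the property.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the actual TupleWritableComparator code.
final class OrderingGuard {

    // Reads "crunch.ordering" from a Map that stands in for the job
    // configuration and fails with a descriptive message instead of a
    // bare NullPointerException when the property is missing.
    static List<String> parseOrdering(Map<String, String> conf) {
        String ordering = conf.get("crunch.ordering");
        if (ordering == null) {
            throw new IllegalStateException(
                "crunch.ordering is not set in the job configuration");
        }
        return Arrays.asList(ordering.split(","));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // Property missing: the guard reports which setting is absent.
        try {
            parseOrdering(conf);
        } catch (IllegalStateException e) {
            System.out.println("Failed as expected: " + e.getMessage());
        }

        // Property present (placeholder value): the split succeeds.
        conf.put("crunch.ordering", "a,b");
        System.out.println(parseOrdering(conf));
    }
}
```

A guard like this would not fix the underlying problem of the property never reaching the job configuration, but it would make the cause visible in the task logs.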

Am I doing something wrong?
