hadoop-common-user mailing list archives

From Derek Shaw <derek.s...@rogers.com>
Subject Fwd: Collecting output not to file
Date Wed, 07 May 2008 23:38:10 GMT
To clarify:
     static class TestOutputFormat
         implements OutputFormat <Text, Text> {

         static class TestRecordWriter
             implements RecordWriter <Text, Text> {
             TestOutputFormat output;

             public TestRecordWriter (TestOutputFormat output, org.apache.hadoop.fs.FileSystem ignored,
                     JobConf job, String name, Progressable progress) {
                 this.output = output;
             }

             public void close (Reporter reporter) {
             }

             public void write (Text key, Text value) {
                 output.addResults (value.toString ());
             }
         }

         protected String results = "";

         public void checkOutputSpecs (org.apache.hadoop.fs.FileSystem ignored, JobConf job)
             throws IOException {
         }

         public RecordWriter <Text, Text> getRecordWriter (org.apache.hadoop.fs.FileSystem ignored,
                 JobConf job, String name, Progressable progress) {
             return new TestRecordWriter (this, ignored, job, name, progress);
         }

         public void addResults (String r) {
             results += r + ",";
         }

         public String getResults () {
             return results;
         }
     }

 And then running the task:
 public int run(String[] args)
         throws Exception {
         // getOutputFormat creates a new instance of the output format. I want to get the
         // instance of the output format that the reduce function wrote to.
         // The RecordWriter that reduce wrote to would be just as good.
         TestOutputFormat results = (TestOutputFormat) job.getOutputFormat ();
         // Always prints the empty string, not the populated results
         System.out.println ("results: " + results.getResults ());
         return 0;
 }
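
The empty output described in the comments above can be reproduced without Hadoop. The following is only a sketch under stated assumptions: MiniConf and InstanceResults are hypothetical stand-ins (not Hadoop classes) for a configuration that stores just the output format class and for an output format that keeps its results in an instance field. If the configuration builds a fresh object on each call, as the comment in run() says getOutputFormat does, the driver never sees the instance that was written to.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobConf: it remembers only the class, never an instance.
class MiniConf {
    private final Map<String, Class<?>> classes = new HashMap<>();

    void setOutputFormat(Class<?> cls) {
        classes.put("output.format", cls);
    }

    // Builds a brand-new object on every call, as a reflective factory would.
    Object getOutputFormat() throws ReflectiveOperationException {
        return classes.get("output.format").getDeclaredConstructor().newInstance();
    }
}

// Hypothetical stand-in for TestOutputFormat, keeping results in an instance field.
class InstanceResults {
    private final StringBuilder results = new StringBuilder();

    void addResults(String r) {
        results.append(r).append(",");
    }

    String getResults() {
        return results.toString();
    }
}

class FreshInstanceDemo {
    public static void main(String[] args) throws Exception {
        MiniConf conf = new MiniConf();
        conf.setOutputFormat(InstanceResults.class);

        // The "framework" obtains its own instance and writes to it.
        InstanceResults usedByJob = (InstanceResults) conf.getOutputFormat();
        usedByJob.addResults("a");

        // The driver asks again and receives a different, empty instance.
        InstanceResults seenByDriver = (InstanceResults) conf.getOutputFormat();
        System.out.println("same instance: " + (usedByJob == seenByDriver)); // prints "same instance: false"
        System.out.println("results: " + seenByDriver.getResults());
    }
}
```

The second println shows nothing after "results: ", which matches the symptom reported above: state stored in an instance field is lost whenever the caller cannot reach the instance that did the writing.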

Derek Shaw <derek.shaw@rogers.com> wrote:
Date: Tue, 6 May 2008 23:26:30 -0400 (EDT)
From: Derek Shaw <derek.shaw@rogers.com>
Subject: Collecting output not to file
To: core-user@hadoop.apache.org


From the examples that I have seen thus far, all of the results from the reduce function
are written to a file. Instead of writing results to a file, I want to store them and
inspect them after the job is completed. (I think that I need to implement my own OutputCollector,
but I don't know how to tell Hadoop to use it.) How can I do this?
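
One possible way to keep results in memory, sketched here in plain Java rather than against the Hadoop API, is to have every writer append to a single JVM-wide holder instead of to an instance field. InMemoryResults and CollectingWriter are hypothetical names used only for illustration; a real writer would implement Hadoop's RecordWriter interface. This can only work when the tasks run in the same JVM as the code that inspects the results (such as a local, single-process run). On a cluster the reduce tasks run in separate processes, so the results would have to be written to shared storage and read back.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical JVM-wide holder: every writer instance appends to the same list.
final class InMemoryResults {
    private static final List<String> VALUES = new CopyOnWriteArrayList<>();

    private InMemoryResults() {
    }

    static void add(String value) {
        VALUES.add(value);
    }

    // Returns a copy so callers cannot modify the shared list.
    static List<String> snapshot() {
        return new ArrayList<>(VALUES);
    }

    static void clear() {
        VALUES.clear();
    }
}

// Hypothetical stand-in for a record writer; it holds no state of its own.
class CollectingWriter {
    void write(String key, String value) {
        InMemoryResults.add(key + "=" + value);
    }
}

class InMemoryDemo {
    public static void main(String[] args) {
        InMemoryResults.clear();
        // Two independent writer instances still feed the same holder.
        new CollectingWriter().write("k1", "v1");
        new CollectingWriter().write("k2", "v2");
        System.out.println("results: " + InMemoryResults.snapshot()); // prints "results: [k1=v1, k2=v2]"
    }
}
```

Because the holder is static, it does not matter which writer instance the framework creates; the trade-off is that the state is global and must be cleared between runs.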

