hadoop-common-user mailing list archives

From Derek Shaw <derek.s...@rogers.com>
Subject Fwd: Collecting output not to file
Date Wed, 07 May 2008 23:38:10 GMT
To clarify:
 
     static class TestOutputFormat
         implements OutputFormat <Text, Text>
     {
         static class TestRecordWriter
             implements RecordWriter <Text, Text>
         {
             TestOutputFormat output;
             
             public TestRecordWriter (TestOutputFormat output, org.apache.hadoop.fs.FileSystem ignored,
                                      JobConf job, String name, Progressable progress)
             {
                 this.output = output;
             }
             
             public void close (Reporter reporter)
             {}
             
             public void write (Text key, Text value)
             {
                 output.addResults (value.toString ());
             }
         }
         
         protected String results = "";
                 
         public void checkOutputSpecs (org.apache.hadoop.fs.FileSystem ignored, JobConf job)
             throws IOException
         {}
         
         public RecordWriter <Text, Text> getRecordWriter (org.apache.hadoop.fs.FileSystem ignored,
                                                           JobConf job, String name, Progressable progress)
         {
             return new TestRecordWriter (this, ignored, job, name, progress);
         }
         
         public void addResults (String r)
         {
             results += r + ",";
         }
         
         public String getResults ()
         {
             return results;
         }
     }

 And then running the task:
 public int run(String[] args) 
         throws Exception 
     {
     ....
     JobClient.runJob(job);
         
         // getOutputFormat creates a new instance of the output format. I want to get the
         // instance of the output format that the reduce function wrote to.
         // The RecordWriter that reduce wrote to would be just as good.
         TestOutputFormat results = (TestOutputFormat) job.getOutputFormat ();  
   
         // Always prints the empty string, not the populated results
         System.out.println ("results: " + results.getResults ());   
         
         return 0;
     }
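
 A plain-Java sketch of why the lookup above comes back empty. It uses no Hadoop classes: SharedResultsSketch, InMemoryFormat and newFormat are hypothetical names made up for illustration, and newFormat only mimics a framework that builds the output format by reflection each time it is asked for one. Results kept in an instance field stay with the instance that wrote them, while a static field is visible to every instance in the same JVM.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedResultsSketch {

    /** Hypothetical stand-in for an output format that keeps results in memory. */
    public static class InMemoryFormat {
        // Per-instance state: only the instance that received the writes sees it.
        private final List<String> instanceResults = new ArrayList<>();

        // Class-level state: shared by every instance loaded in the same JVM.
        private static final List<String> SHARED = new ArrayList<>();

        public void add(String value) {
            instanceResults.add(value);
            synchronized (SHARED) {
                SHARED.add(value);
            }
        }

        public String instanceResults() {
            return String.join(",", instanceResults);
        }

        public static String sharedResults() {
            synchronized (SHARED) {
                return String.join(",", SHARED);
            }
        }

        public static void reset() {
            synchronized (SHARED) {
                SHARED.clear();
            }
        }
    }

    /** Mimics a framework that creates a fresh format by reflection on every call. */
    public static InMemoryFormat newFormat() {
        try {
            return InMemoryFormat.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not instantiate format", e);
        }
    }

    public static void main(String[] args) {
        InMemoryFormat.reset();

        // The instance the "reduce side" writes to.
        InMemoryFormat writerSide = newFormat();
        writerSide.add("a");
        writerSide.add("b");

        // The instance the "driver side" gets when it asks again afterwards.
        InMemoryFormat driverSide = newFormat();

        System.out.println("instance: [" + driverSide.instanceResults() + "]"); // prints "instance: []"
        System.out.println("shared: [" + InMemoryFormat.sharedResults() + "]"); // prints "shared: [a,b]"
    }
}
```

 The static field only helps while the writer and the reader live in the same JVM, which I would expect to hold only when the job runs in-process with the local job runner. On a cluster the reduce tasks run in other processes, so results probably still have to travel through some shared store.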

Derek Shaw <derek.shaw@rogers.com> wrote:
Date: Tue, 6 May 2008 23:26:30 -0400 (EDT)
From: Derek Shaw <derek.shaw@rogers.com>
Subject: Collecting output not to file
To: core-user@hadoop.apache.org

 Hey,

From the examples that I have seen thus far, all of the results from the reduce function
are being written to a file. Instead of writing results to a file, I want to store them and
inspect them after the job is completed. (I think that I need to implement my own OutputCollector,
but I don't know how to tell hadoop to use it.) How can I do this?

-Derek

