crunch-user mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: reporter of a Crunch job?
Date Tue, 07 Apr 2015 22:57:01 GMT
Hey Lucy,

Your MapFn should have access to a TaskInputOutputContext object via the
getContext() method, which supports the Reporter-related methods you would
want to implement via your CrunchReporter class. I wrote some code like
this to support using old mapred.* style API classes inside of Crunch
pipelines:

https://github.com/apache/crunch/blob/master/crunch-core/src/main/java/org/apache/crunch/lib/Mapred.java
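A minimal sketch of that pattern, kept free of Hadoop and Crunch dependencies so it compiles on its own. The `Progressable` interface below is a stand-in mirroring `org.apache.hadoop.util.Progressable`, the abstract `Reporter` class is a hypothetical stand-in for the third-party library's class, and `initialize(Progressable)` is an assumption that imitates overriding `initialize()` in a real MapFn and passing in `getContext()`. The idea is to keep only serializable settings in the constructor and build the reporter on the task, once a context exists:

```java
import java.io.Serializable;

// Stand-in for org.apache.hadoop.util.Progressable so the sketch compiles without Hadoop.
interface Progressable {
    void progress();
}

// Hypothetical stand-in for the third-party library's Reporter base class.
abstract class Reporter {
    abstract void report(String message);
}

// Adapter: forwards the library's callbacks to whatever Progressable the task provides.
class CrunchReporter extends Reporter {
    private final Progressable progressable;
    private final StringBuilder log = new StringBuilder();

    CrunchReporter(Progressable progressable) {
        this.progressable = progressable;
    }

    @Override
    void report(String message) {
        log.append(message).append('\n');
        progressable.progress(); // signal the framework that the task is still alive
    }

    String logged() {
        return log.toString();
    }
}

// Mimics the shape of a MapFn: constructed on the client, initialized on the task.
class ClassA implements Serializable {
    private final String paramsInput;          // serializable configuration only
    private transient CrunchReporter reporter; // built on the task, never serialized

    ClassA(String paramsInput) {
        this.paramsInput = paramsInput;
    }

    // In a real MapFn this would be an initialize() override that uses getContext().
    void initialize(Progressable context) {
        this.reporter = new CrunchReporter(context);
    }

    String map(String input) {
        reporter.report("mapping " + input + " with " + paramsInput);
        return input.toUpperCase();
    }

    CrunchReporter reporter() {
        return reporter;
    }
}
```

In an actual pipeline, the object returned by getContext() should be usable wherever a Progressable is expected, since Hadoop's task context types extend that interface. The context is not available in the constructor, which runs on the client before the function is serialized and shipped to the tasks.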

Josh

On Tue, Apr 7, 2015 at 3:50 PM, Lucy Chen <lucychen2014fall@gmail.com>
wrote:

> Hi,
>
>      I implemented a MapFn, ClassA, and in its map function I need to call a
> function from a third-party library. That function takes a parameter of
> type Reporter, which is a built-in class of that library, so I wrote a
> child class, CrunchReporter. But when I create an instance of
> CrunchReporter, I don't know what reporter and log I can pass in as its
> inputs.
>
>       I used Pig before. Its EvalFunc class has the reporter and log as
> built-in fields, org.apache.pig.EvalFunc.reporter and
> org.apache.pig.EvalFunc.log, and I passed those fields in to create a new
> PigReporter. For Crunch functions such as DoFn or MapFn, how can I supply
> the reporter and log inputs in
> this.params = new Params(params_input, ?reporter, ?log)?
>
>    Thanks!
>
> Lucy
>
> public class ClassA extends MapFn<T, T> {
>
>     public transient Params params;
>
>     public ClassA(String params_input) {
>         this.params = new Params(params_input, ?reporter, ?log);
>     }
>
>     @Override
>     public map() {
>         .....
>     }
> }
>
>
> public class Params implements java.io.Serializable {
>
>     private CrunchReporter reporter = null;
>
>     .....
>
>     public Params(String input, Progressable crunch_reporter, Log log) {
>         ..........
>         this.reporter = new CrunchReporter(crunch_reporter, log);
>     }
> }
>
>
> public class CrunchReporter extends Reporter {
>
>     Progressable crunchReporter = null;
>     PrintStream out = null;
>     Log log = null;
>
>     public CrunchReporter(Progressable crunchReporter, Log log) {
>         super(LogLevel.DEBUG);
>         this.crunchReporter = crunchReporter;
>         this.log = log;
>     }
> }
>



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
