hadoop-common-user mailing list archives

From "Milind A Bhandarkar" <mili...@yahoo-inc.com>
Subject Re: Re: Streaming --counters question
Date Wed, 11 Jun 2008 03:33:55 GMT
First of all +1 to the proposal.

Here is what can be done in this regard:

On stderr, if the output line is of the form:

$$$ Hadoop-counter: org.apache.hadoop.MyApp.CounterName=SomeNumber

(The dollar signs are just there to distinguish these lines from normal stderr output. They
do not indicate the cost of feature development ;-))

Then it is treated as a counter by the Hadoop streaming framework and communicated to the
tasktracker as such, which then dutifully communicates it to the jobtracker.
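
A minimal sketch of what such a line could look like from the script's side, and how a framework might recognize it. This is only an illustration of the format proposed above, not an existing Hadoop streaming API; the helper names (format_counter_line, parse_counter_line, report_counter) and the assumption that counter values are integers are hypothetical.

```python
import re
import sys

# Prefix taken from the proposal in this message; hypothetical, not a confirmed Hadoop API.
COUNTER_PREFIX = "$$$ Hadoop-counter: "

# Matches "<prefix><name>=<integer>"; integer-only values are an assumption of this sketch.
_COUNTER_RE = re.compile(r"^\$\$\$ Hadoop-counter: (?P<name>[^=\s]+)=(?P<value>-?\d+)$")


def format_counter_line(name, value):
    """Build one counter line in the proposed format (script side)."""
    return "%s%s=%d" % (COUNTER_PREFIX, name, value)


def parse_counter_line(line):
    """Return (name, value) for a counter line, or None for ordinary stderr output."""
    match = _COUNTER_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return match.group("name"), int(match.group("value"))


def report_counter(name, value, stream=sys.stderr):
    """Write a counter line to stderr, where the framework would look for it."""
    stream.write(format_counter_line(name, value) + "\n")
```

A streaming mapper would call report_counter("org.apache.hadoop.MyApp.CounterName", 1) whenever it wants to bump the counter, while the framework would run something like parse_counter_line over each stderr line and pass anything that returns None through as normal log output.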

That's a nice idea. But as often happens in the open-source world, the core-developers are
going to say "Sure. This will be a great addition! Can you provide a patch?"

If you care about this feature, let me ask you preemptively: "Care to provide a patch?"

- milind

----- Original Message -----
From: news <news@ger.gmane.org>
To: core-user@hadoop.apache.org <core-user@hadoop.apache.org>
Sent: Tue Jun 10 20:16:50 2008
Subject:  Re: Streaming --counters question

Streaming works on stdin and stdout, so unless there were a way to capture 
stdout as a counter, I do not see any other way to report them to the 
jobtracker, unless there were a URL the task could call on the jobtracker to 
update counters.


"Miles Osborne" <miles@inf.ed.ac.uk> wrote in 
message news:73e5a5310806101516h5315eeadwdf90d59e315fc559@mail.gmail.com...
> Is there support for counters in streaming?  In particular, it would be
> nice to be able to access these after a job has run.
> Thanks!
> Miles
> -- 
> The University of Edinburgh is a charitable body, registered in Scotland,
> with registration number SC005336.
