hadoop-common-user mailing list archives

From Mayuran Yogarajah <mayuran.yogara...@casalemedia.com>
Subject Re: How best to collect userlogs (in a streaming world)
Date Mon, 28 Sep 2009 21:35:35 GMT
Dan Milstein wrote:
> Hadoop-folk,
> How have people gone about collecting debug/error log information from
> streaming jobs, in Hadoop?
> I'm clear that, if I write to stderr (and it's not a counter/status
> line), then it goes onto the node's local disk, in:
>   /var/log/hadoop/userlogs/<task attempt>/stderr
> However, I'd really like to collect those in some central location,
> for processing.  Possibly via splunk (which we use right now),
> possibly some other means.
>   - Do people write a custom log4j appender?  (does log4j even control
> writes to that stderr file?  I can't tell -- it somewhat looks like no)
>   - Or, maybe write cron jobs that run on the slaves and periodically
> push logs somewhere?
>   - Are people outside of Facebook using scribe?
> Any ideas / experiences appreciated.
> Thanks,
> -Dan Milstein
We use remote syslog for this.  All warning/error messages get forwarded
to a central log server.  This server writes these messages to a named
pipe.  A separate script reads from the named pipe and emails the errors
to the admin.
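
For anyone wanting to try the same setup, here is a minimal sketch of what the reader script on the log server could look like. It is not our actual script. The pipe path, the addresses, the SMTP host, the batch size, and the severity match on "WARN"/"ERROR" are all assumptions made up for illustration, and it assumes the syslog daemon has already been configured to write forwarded messages into the named pipe.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical values for illustration; none come from the original message.
PIPE_PATH = "/var/run/hadoop-errors.fifo"
ADMIN_ADDR = "admin@example.com"
FROM_ADDR = "syslog@example.com"


def is_error(line):
    """Keep only lines that mention a warning or error severity."""
    upper = line.upper()
    return "WARN" in upper or "ERROR" in upper


def build_email(lines, sender=FROM_ADDR, recipient=ADMIN_ADDR):
    """Bundle a batch of log lines into one plain-text email."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Hadoop task errors (%d lines)" % len(lines)
    msg.set_content("\n".join(lines))
    return msg


def send_batch(lines, smtp_host):
    """Send one batch through an SMTP relay (assumed to accept local mail)."""
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(build_email(lines))


def watch_pipe(pipe_path=PIPE_PATH, batch_size=20, smtp_host="localhost"):
    """Read the named pipe forever, mailing matching lines in batches."""
    while True:
        # open() blocks until a writer (the syslog daemon) opens the pipe.
        with open(pipe_path, "r") as pipe:
            batch = []
            for line in pipe:
                line = line.rstrip("\n")
                if is_error(line):
                    batch.append(line)
                if len(batch) >= batch_size:
                    send_batch(batch, smtp_host)
                    batch = []
            # The writer closed the pipe: flush what is left, then reopen.
            if batch:
                send_batch(batch, smtp_host)


if __name__ == "__main__":
    watch_pipe()
```

One limitation of this sketch: a partial batch is only mailed when the batch fills up or the writer closes the pipe, so with a long-lived syslog writer a few errors can sit unsent for a while. Setting batch_size to 1 trades that delay for one email per message.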

I'd like to try out Scribe at some point; it looks neat.

