incubator-crunch-dev mailing list archives

From Josh Wills
Subject Re: Logging/Debugging
Date Sat, 22 Sep 2012 15:22:25 GMT

On Sat, Sep 22, 2012 at 12:52 AM, Matthias Friedrich wrote:
> Hi,
> I'd like to discuss two things regarding logging and debugging:
> 1) Crunch currently ships a which can have precedence
> over users', depending on classpath order. Libraries
> should never ship logging config as it forces users to repackage
> Crunch if they want to use their own. Our Nexus at work has a nice
> collection of repackaged libs.

I'm on board with this, with the caveat that we'll need to move the
logging we do now (e.g., which stage of the pipeline is currently
running, the URLs for job tracking, etc.) to a control scheme that
isn't based on log4j.

> 2) Discussion about Pipeline.enableDebug() came up in CRUNCH-70. I
> believe it really shouldn't mess with logging configuration. Right now
> it bypasses the commons-logging facade and directly accesses log4j,
> causing a compile time dependency on log4j. It changes VM-wide state
> beyond Crunch as other Hadoop-related code executed afterwards will
> get changed logging config, too. And, most importantly, it's the
> responsibility of the operations team, not the developer to configure
> logging. Admins are used to, we shouldn't invent
> another non-standard way of doing things that overrides the usual
> way.
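
To illustrate the point about VM-wide state, here is a minimal sketch of
how a helper that reconfigures the logging backend directly also affects
unrelated code in the same JVM. It uses java.util.logging as a stand-in
for log4j so that it runs with no extra dependencies. The class name,
the enableDebug() method here, and the "com.example.unrelated" logger
name are all hypothetical; this is not Crunch's actual implementation.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class GlobalLoggingStateDemo {

    // Hypothetical stand-in for a library helper that turns on debug
    // output by reconfiguring the logging backend directly.
    static void enableDebug() {
        // The root logger (empty name) is the ancestor of every logger
        // in the JVM, so this change is not limited to one library.
        Logger.getLogger("").setLevel(Level.FINE);
    }

    // Reports whether debug-level (FINE) messages are enabled for the
    // logger with the given name.
    static boolean debugEnabled(String loggerName) {
        return Logger.getLogger(loggerName).isLoggable(Level.FINE);
    }

    public static void main(String[] args) {
        // Hypothetical logger name standing in for unrelated code.
        String unrelated = "com.example.unrelated";

        // Start from a known level so the demo is deterministic.
        Logger.getLogger("").setLevel(Level.INFO);

        System.out.println("before: " + debugEnabled(unrelated));
        enableDebug();
        System.out.println("after: " + debugEnabled(unrelated));
    }
}
```

The unrelated logger never has a level set on it; it picks up the change
through inheritance from the root logger, which is the kind of side
effect the paragraph above describes.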

enableDebug is intended to be used by developers who are, well,
debugging their MR pipelines, which is something that every MR
developer spends a fair amount of time doing. I will argue against any
changes that make debugging more difficult-- if anything, I would like
to make it even easier. I don't draw a distinction between ops and
development when it comes to creating pipelines.

> My vote for 1) would be to remove our
> For 2) I think the best solution would be to offer an example
> in our documentation (section "Debugging Your
> Pipelines" or something) that has the effect Pipeline.enableDebug()
> has now.
> Regards,
>   Matthias

Director of Data Science
Twitter: @josh_wills
