hadoop-common-issues mailing list archives

From "Sean Busbey (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-11804) POC Hadoop Client w/o transitive dependencies
Date Thu, 03 Nov 2016 08:49:59 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-11804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Sean Busbey updated HADOOP-11804:
    Attachment: HADOOP-11804.5.patch


  - update to current master (0e75496)
  - includes patch for HADOOP-13789 so precommit can run
  - incorporate review feedback
  - exclude slf4j backend from hadoop-client pom
  - leave runtime dependency on htrace, slf4j-api, commons-logging, and log4j

I made the log4j dependency optional, since there are plenty of use cases where a downstream
user wouldn't need it (it's needed for MapReduce and to use some custom Log4j appenders we

Yea, this was a mystery to me too. Could be PEBKAC, in which case I'd appreciate fuller build

I don't know if something changed or if I was just building incorrectly before. When I used
the build given by [~sjlee0], I easily reproduced this issue. Since it's a general problem
(including test classes in two places), I filed HADOOP-13789 to fix it.

One high-level concern is maintaining dependencies in the poms. If a developer
adds a new dependency to a module, how would that propagate to these client poms? Would he/she
need to add it to these client poms for the most part? It wasn't entirely clear to me what
the cost of maintenance is. If that is the only way to keep it clean, that's OK. But it would
be great if that cost is kept to a minimum.

If a developer adds a new dependency that impacts clients, then the only place they should
have to update is the shaded minicluster to make sure the dependency is only included in one
of the three shaded artifacts. At the moment, this is the only way to make sure a given class
only ends up in one place. Ideally we'd reduce the long-term cost by adding a custom shader
that can exclude classes that appear in a given dependency. I'm not sure how long creating
that shader will take, so I didn't do it yet.
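
Until such a shader exists, the "a given class only ends up in one place" property could be
checked after the build. The following is only a minimal sketch of that idea, not part of
the patch: the class name DuplicateClassCheck and its methods are hypothetical, and the jar
paths are whatever shaded artifacts are passed on the command line.

```java
import java.io.IOException;
import java.util.*;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper (not in the patch): reports classes present in more than one jar.
public class DuplicateClassCheck {

    /** Lists the names of all .class entries in the jar at the given path. */
    static List<String> classEntries(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                      .map(entry -> entry.getName())
                      .filter(name -> name.endsWith(".class"))
                      .collect(Collectors.toList());
        }
    }

    /**
     * Maps each class entry that occurs in two or more artifacts to the
     * sorted set of artifacts containing it. Non-class entries are ignored.
     */
    static Map<String, Set<String>> findDuplicates(Map<String, List<String>> entriesByArtifact) {
        Map<String, Set<String>> owners = new TreeMap<>();
        for (Map.Entry<String, List<String>> artifact : entriesByArtifact.entrySet()) {
            for (String name : artifact.getValue()) {
                if (name.endsWith(".class")) {
                    owners.computeIfAbsent(name, k -> new TreeSet<>()).add(artifact.getKey());
                }
            }
        }
        // Keep only the classes that show up in more than one artifact.
        owners.values().removeIf(artifacts -> artifacts.size() < 2);
        return owners;
    }

    public static void main(String[] args) throws IOException {
        Map<String, List<String>> entries = new LinkedHashMap<>();
        for (String path : args) {
            entries.put(path, classEntries(path));
        }
        Map<String, Set<String>> duplicates = findDuplicates(entries);
        duplicates.forEach((cls, jars) -> System.out.println(cls + " -> " + jars));
        if (!duplicates.isEmpty()) {
            System.exit(1); // non-zero exit so a build script can fail on duplicates
        }
    }
}
```

Running it with the three shaded jars as arguments would print each duplicated class along
with the jars that contain it, and exit non-zero if any are found.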

1. patch doesn't apply
2. no mortbay.jetty version
3. duplicate jetty classes

These were all fallout from the update to jetty 9. They should all be cleaned up in this rebase.

4. Was there a significant difficulty in handling the timeline service v.2? Is it just the
number of new dependencies we’re pulling in or the fact that there is an HBase dependency?

The volume of dependencies caused me some concern, but HBase getting pulled in by default
was the blocker for me. Since I'm trying to test HBase as a downstream application, having
a shaded HBase dependency included (for a feature that is off by default) seems ill-advised.

Adding it back in for downstream folks that need it should be straightforward, since it is
included as an optional dependency. They just need to add the artifact as a test dependency.

> POC Hadoop Client w/o transitive dependencies
> ---------------------------------------------
>                 Key: HADOOP-11804
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11804
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: build
>            Reporter: Sean Busbey
>            Assignee: Sean Busbey
>         Attachments: HADOOP-11804.1.patch, HADOOP-11804.2.patch, HADOOP-11804.3.patch,
HADOOP-11804.4.patch, HADOOP-11804.5.patch
> make a hadoop-client-api and hadoop-client-runtime that e.g. HBase can use to talk with
a Hadoop cluster without seeing any of the implementation dependencies.
> see proposal on parent for details.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org
