incubator-bigtop-dev mailing list archives

From "Roman Shaposhnik (Updated) (JIRA)" <>
Subject [jira] [Updated] (BIGTOP-358) now that hadoop packages have been split we have to update the dependencies on the downstream packages
Date Fri, 27 Jan 2012 19:52:09 GMT


Roman Shaposhnik updated BIGTOP-358:


The attached dot and png files show what I have figured out so far (rectangular boxes represent
capabilities that will be provided by actual packages, and dotted lines represent
"optional/recommended" dependencies). I still have a few concerns:

1. I think it is pretty clear by now that the mapreduce dependency has to be on a capability,
not an actual package (and then we'll have hadoop-mapreduce "Provides:" that capability). The
question is whether we are ready to do the same with hadoop-hdfs, and what those capabilities
should be called. My proposal is to call them "mapreduce" and "dfs" respectively, and to make
the actual packages hadoop-mapreduce and hadoop-hdfs provide those capabilities for now.

2. For pig, hive, sqoop and mahout the real hard dependency is mapreduce. The dependency on
dfs is optional (they can run just fine in local mode without ever talking to HDFS).
The question is: what's the best mechanism to "recommend" dfs? I know we can do that with
Debian packages (the Recommends field), but what about RPM? Finally, are we doing the right
thing by treating dfs as an optional dependency, or should we enforce it to begin with?

3. HBase is a weird case here. At the Maven level they package all of their dependencies
(optional or not) into lib/*, so they end up with a whole bunch of jars there that we're
currently replacing with symlinks. Not all of those dependencies are needed by HBase in all
cases (in fact the only hard dependency there is ZooKeeper), but having dangling symlinks
doesn't seem appealing. The question is: what do we do?
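
To make the capability idea in points 1 and 2 concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the Package class and the check_install function are hypothetical names invented for illustration, and this is not how rpm or dpkg actually resolve dependencies. It only shows a hard dependency on the "mapreduce" capability being satisfied by hadoop-mapreduce, while an unmet "dfs" recommendation is reported without blocking the install.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    """Hypothetical, simplified package record (not a real rpm/dpkg structure)."""
    name: str
    provides: set = field(default_factory=set)    # capabilities this package offers
    requires: set = field(default_factory=set)    # hard dependencies
    recommends: set = field(default_factory=set)  # optional dependencies


def check_install(installed):
    """Return (missing_hard, missing_soft) for a list of packages.

    A dependency counts as satisfied if any installed package either has
    that name or lists it in its provides set.
    """
    available = set()
    for pkg in installed:
        available.add(pkg.name)
        available |= pkg.provides

    missing_hard = {}
    missing_soft = {}
    for pkg in installed:
        hard = pkg.requires - available
        soft = pkg.recommends - available
        if hard:
            missing_hard[pkg.name] = hard
        if soft:
            missing_soft[pkg.name] = soft
    return missing_hard, missing_soft


# Capability names follow the proposal above: "mapreduce" and "dfs".
hadoop_mapreduce = Package("hadoop-mapreduce", provides={"mapreduce"})
hadoop_hdfs = Package("hadoop-hdfs", provides={"dfs"})
pig = Package("pig", requires={"mapreduce"}, recommends={"dfs"})

# Local-mode install: pig plus hadoop-mapreduce, no HDFS.
hard, soft = check_install([pig, hadoop_mapreduce])
```

In this model, installing pig with only hadoop-mapreduce leaves no hard dependency unmet and flags "dfs" as an unmet recommendation. Adding hadoop_hdfs to the list clears that as well.
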
> now that hadoop packages have been split we have to update the dependencies on the downstream packages
> ------------------------------------------------------------------------------------------------------
>                 Key: BIGTOP-358
>                 URL:
>             Project: Bigtop
>          Issue Type: Bug
>            Reporter: Roman Shaposhnik
>            Assignee: Roman Shaposhnik
>         Attachments:, bigtop.png
> This is actually slightly more complicated than it sounds: it is pretty straightforward
> to replace a dependency on hadoop with a dependency on hadoop-mapreduce, but it is less
> clear what to do with HDFS. Strictly speaking, HDFS is not a hard dependency (one can run
> on a local filesystem just fine).
> Thoughts?


