hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12547) Deprecate hadoop-pipes
Date Wed, 04 Nov 2015 19:48:27 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990266#comment-14990266

Chris Nauroth commented on HADOOP-12547:

Some of this discussion has not been constructive.  I urge everyone to stick to the technical
points of the debate.

I'm still weighing this, but I have a few other points to mention for consideration.

Part of the argument presented here for deprecation/removal is that development has halted.
It's worth noting that the flow of patches for MapReduce itself has slowed significantly
since completion of YARN/MRv2.  By extension, a C++ wrapper over MapReduce is going to see
even fewer contributions.  I don't think patch count alone is a sufficient measure to justify
the elimination (or the existence) of a component.

I have no direct experience with my users using hadoop-pipes, but I also don't see it as a
hindrance to maintain if someone like Yahoo does find it useful.  Another part of the argument
for removal was reduced build times.  I do not see this component causing a significant delay
in build times though.  Granted, that's partly due to the lack of tests.

A more telling problem is the lack of tests.  Maybe I'm mistaken, but has the documentation
vanished too?  These are gaps that don't speak well to the long-term viability of the component.
If we cannot come to consensus on removal, then we need to commit to filling those gaps.

As a matter of process, I disagree with adding libwebhdfs as a rider to this proposal.  I
don't think the two are in a comparable state.  However, I do agree that libwebhdfs is a much
more viable candidate for removal.  We have evidence that Pipes was at least used by someone
at some time, worked correctly, and satisfied its design goals.  I don't believe we have any
evidence that anyone has ever used libwebhdfs, it still doesn't build properly in recent releases,
and it does not satisfy its design goal of providing a library with no JVM dependency.  (This
can be viewed as just a bug, but there is also not overwhelming support for bothering to fix

> Deprecate hadoop-pipes
> ----------------------
>                 Key: HADOOP-12547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12547
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>            Priority: Minor
> Development appears to have stopped on hadoop-pipes upstream for the last few years,
> aside from very basic maintenance.  Hadoop streaming seems to be a better alternative, since
> it supports more programming languages and is better implemented.
> There were no responses to a message on the mailing list asking for users of Hadoop pipes...
> and in my experience, I have never seen anyone use this.  We should remove it to reduce our
> maintenance burden and build times.

This message was sent by Atlassian JIRA