hadoop-common-dev mailing list archives

From "Klaas Bosteels (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HADOOP-4304) Add Dumbo to contrib
Date Mon, 16 Feb 2009 09:09:02 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Klaas Bosteels resolved HADOOP-4304.
------------------------------------

    Resolution: Later

I am closing this issue (for now) because HADOOP-1722 renders it mostly obsolete. Since typed
bytes provide a cleaner and more efficient alternative to the Java code that is currently included
in Dumbo, we will be able to remove all Java code from Dumbo and make it a pure Python
module again. Consequently, future Dumbo versions will be very easy to install and use, which
makes adding Dumbo to contrib less of a requirement. Maybe we can reconsider including Dumbo
in contrib once it is more stable and feature-complete, but right now GitHub (http://github.com/klbostee/dumbo)
seems to be a more convenient place to develop it further.

> Add Dumbo to contrib
> --------------------
>
>                 Key: HADOOP-4304
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4304
>             Project: Hadoop Core
>          Issue Type: New Feature
>            Reporter: Klaas Bosteels
>            Assignee: Klaas Bosteels
>            Priority: Minor
>         Attachments: hadoop-4304-v2.patch, hadoop-4304-v3.patch, hadoop-4304.patch
>
>
> Originally, Dumbo was a simple Python module developed at Last.fm to make writing and
> running Hadoop Streaming programs very easy, but now it also includes some (so far
> unreleased) helper code in Java (although it can still be used without the Java code). We
> propose to add Dumbo to "src/contrib" so that the Java classes get built and installed together
> with the rest of Hadoop, and the Python module can be installed separately at will. A tar.gz
> of the directory that would have to be added to "src/contrib" is available at
> http://static.last.fm/dumbo/dumbo-contrib.tar.gz
> and more info about Dumbo can be found here:
> * Basic documentation: http://github.com/klbostee/dumbo/wikis
> * Presentation at HUG (where it was first suggested to add Dumbo to contrib): http://skillsmatter.com/podcast/home/dumbo-hadoop-streaming-made-elegant-and-easy
> * Initial announcement: http://blog.last.fm/2008/05/29/python-hadoop-flying-circus-elephant
> For some of the more advanced features of Dumbo (in particular the ones for which the
> Java classes are needed) there is no public documentation yet, but we could easily fill that
> gap by moving some of the internal Last.fm documentation to the Hadoop wiki.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

