pig-dev mailing list archives

From "Alex McLintock (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2803) Include Wonderdog (ElasticSearch Integration) in contrib/
Date Sun, 17 Feb 2013 09:53:13 GMT

    [ https://issues.apache.org/jira/browse/PIG-2803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13580158#comment-13580158 ]

Alex McLintock commented on PIG-2803:

Russ, where would be a good place to discuss this? Is it time to use a different tool for
transferring data from Hadoop to ElasticSearch? Or is this something that could be fixed with
some decent Java dev time?
> Include Wonderdog (ElasticSearch Integration) in contrib/
> ---------------------------------------------------------
>                 Key: PIG-2803
>                 URL: https://issues.apache.org/jira/browse/PIG-2803
>             Project: Pig
>          Issue Type: New Feature
>          Components: tools
>    Affects Versions: 0.10.0, 0.11, 0.10.1
>         Environment: contrib/ github
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>            Priority: Critical
>              Labels: contrib, elasticsearch, fun, happy, integration, pants, pig, udf
> I propose to add Wonderdog to Pig contrib/
> Wonderdog is an Apache 2.0 licensed project that adds Hadoop and Pig integration for
> ElasticSearch. This lets you index any Pig relation with a single UDF call, which is very
> powerful. Both writing searchable indexes and loading based on search queries is supported.
> More information on Wonderdog is available at https://github.com/infochimps-labs/wonderdog
> and a great introduction to ElasticSearch is available at http://www.elasticsearchtutorial.com/elasticsearch-in-5-minutes.html
> Wonderdog broke in Pig 0.10.0, and was patched to work here: https://github.com/infochimps-labs/wonderdog/pull/9
> Even still, there is the issue of Pig creating schema files when storing and loading JSON
> that must be manually removed to make Wonderdog go.
> Moving forward, I would like the Pig project to maintain Wonderdog in contrib/ and verify
> that it works with each version increment. Wonderdog is an incredibly useful library that
> is license compatible with Pig itself. Along with ElasticSearch, it adds the ability for any
> user to index his Pig relations and to load subsets of data by pushing search queries down
> to ElasticSearch.
> I use Wonderdog in production and in my book, so I volunteer to do the maintenance on

