hive-dev mailing list archives

From "Jonathan Coveney (JIRA)" <>
Subject [jira] [Commented] (HIVE-1107) Generic parallel execution framework for Hive (and Pig, and ...)
Date Thu, 07 Jun 2012 17:45:23 GMT


Jonathan Coveney commented on HIVE-1107:

Pig committers coming out of the woodwork :)

Keren: I really like this idea in the abstract, and have talked with many people about it.
It's on everyone's mind.

That said, I agree completely with Dmitriy. Proving that you can pipe one random unified operator
through Pig and Hive isn't going to prove very much. The hard part is going to be creating
a system generic enough to handle the diverse object models and extension APIs (UDFs, load
funcs, etc.), as well as decoupling highly Pig- or Hive-specific code from their
respective logical plans. Obviously if you just do a "load + foreach," it's going to be much
easier than building a system that can handle the extensibility people count on.

Godspeed. I'll definitely read anything you guys propose. CC the Pig listserv :)

> Generic parallel execution framework for Hive (and Pig, and ...)
> ----------------------------------------------------------------
>                 Key: HIVE-1107
>                 URL:
>             Project: Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Carl Steinbach
> Pig and Hive each have their own libraries for handling plan execution. As we prepare
> to invest more time improving Hive's plan execution mechanism we should also start to consider
> ways of building a generic plan execution mechanism that is capable of supporting the needs
> of Hive and Pig, as well as other Hadoop data flow programming environments.


