hive-dev mailing list archives

From "Jeremy A. Lucas (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-3574) Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
Date Fri, 12 Oct 2012 21:43:02 GMT

     [ https://issues.apache.org/jira/browse/HIVE-3574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeremy A. Lucas updated HIVE-3574:
----------------------------------

    Description: 
The current behavior of the MapRedTask is to start a process that invokes the "hadoop jar"
command, passing each additional jobconf property as an argument to this Hadoop CLI.

Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for
potentially greater compatibility across platforms, in addition to allowing for these jobs
to be run easily against pseudo-clusters in tests (think MiniMRCluster).

This kind of change could involve something as simple as using a Hadoop Configuration object
with a generic ToolRunner or something similar to run jobs.

Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask.

  was:
Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for
potentially greater compatibility across platforms, in addition to allowing for these jobs
to be run easily against pseudo-clusters in tests (think MiniMRCluster).

This kind of change could involve something as simple as using a Hadoop Configuration object
with a generic ToolRunner or something similar to run jobs.

Specifically, this kind of change would most likely occur in the execute() method of org.apache.hadoop.hive.ql.exec.MapRedTask.

    
> Allow Hive to Submit MapReduce jobs via the MapReduce API (instead of using Hadoop BIN)
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-3574
>                 URL: https://issues.apache.org/jira/browse/HIVE-3574
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor, SQL
>    Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0, 0.7.0, 0.7.1, 0.8.0, 0.8.1, 0.9.0, 0.10.0, 0.9.1
>         Environment: All environments would be affected by this
>            Reporter: Jeremy A. Lucas
>            Priority: Minor
>              Labels: feature, test
>
> The current behavior of the MapRedTask is to start a process that invokes the "hadoop jar"
> command, passing each additional jobconf property as an argument to this Hadoop CLI.
> Having Hive submit generated jobs to an M/R cluster via the MapReduce API would allow for
> potentially greater compatibility across platforms, in addition to allowing for these jobs
> to be run easily against pseudo-clusters in tests (think MiniMRCluster).
> This kind of change could involve something as simple as using a Hadoop Configuration object
> with a generic ToolRunner or something similar to run jobs.
> Specifically, this kind of change would most likely occur in the execute() method of
> org.apache.hadoop.hive.ql.exec.MapRedTask.
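
To make the current behavior concrete, here is a minimal sketch of how job properties might be flattened into a "hadoop jar" argument list before a child process is started. It is not the MapRedTask code: the class name JobConfArgs, the method buildCommand, the jar and driver names, and the "-jobconf" flag are all hypothetical placeholders chosen for illustration. The sketch only builds the argument list and does not launch anything. With API-based submission, the same properties would instead be set on a Hadoop Configuration object in-process, so no such flattening would be needed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JobConfArgs {

    // Builds a "hadoop jar" style argument list, appending one flag/value
    // pair per job property. The "-jobconf" flag is an assumed placeholder,
    // not necessarily what MapRedTask emits.
    static List<String> buildCommand(String hadoopBin, String jar,
                                     String mainClass, Map<String, String> props) {
        List<String> cmd = new ArrayList<>();
        cmd.add(hadoopBin);
        cmd.add("jar");
        cmd.add(jar);
        cmd.add(mainClass);
        for (Map.Entry<String, String> e : props.entrySet()) {
            cmd.add("-jobconf");
            cmd.add(e.getKey() + "=" + e.getValue());
        }
        return cmd;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the arguments are stable.
        Map<String, String> props = new LinkedHashMap<>();
        props.put("mapred.job.name", "example");
        props.put("mapred.reduce.tasks", "2");

        // Hypothetical jar and driver class names.
        List<String> cmd = buildCommand("hadoop", "example-job.jar", "example.Driver", props);

        // A ProcessBuilder could start this command as a child process;
        // here the command line is only printed.
        System.out.println(String.join(" ", cmd));
    }
}
```

Every property crosses a process boundary as a string argument in this model, which is part of why running against an in-process pseudo-cluster such as MiniMRCluster is awkward.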

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
