hive-dev mailing list archives

From "Carl Steinbach (JIRA)" <>
Subject [jira] [Updated] (HIVE-835) Deprecate, remove, or fix MAP and REDUCE syntax.
Date Tue, 26 Jul 2011 23:27:10 GMT


Carl Steinbach updated HIVE-835:

    Component/s: SQL

> Deprecate, remove, or fix MAP and REDUCE syntax.
> ------------------------------------------------
>                 Key: HIVE-835
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: SQL
>            Reporter: Adam Kramer
> There are syntactic elements MAP and REDUCE which function as syntactic sugar for SELECT
> TRANSFORM. This behavior is not at all intuitive, because no checking or verification is done
> to ensure that the user's intention is met.
> Specifically, Hive may see a MAP query and simply tack the transform script on to the
> end of a reduce job (so the user says MAP but Hive does a REDUCE), or (more dangerously)
> vice versa. Given that Hive's whole point is to sit on top of a MapReduce framework and allow
> transformations in the mapper or reducer, it seems very inappropriate for Hive to ignore a
> clear command from the user to MAP or to REDUCE the data using a script.
> Better behavior would be for Hive to see a MAP command and to start a new MapReduce step
> and run the command in the mapper (even if it otherwise would be run in the reducer), and
> for REDUCE to begin a reduce step if necessary (so, tack the REDUCE script on to the end of
> a reduce job if the current system would do so, or if not, treat the 0th column as the reduce
> key, throw a warning saying this has been done, and force a reduce job).
> Acceptable behavior would be to throw an error or warning when the user's clearly-stated
> desire is going to be ignored. "Warning: User used MAP keyword, but transformation will occur
> in the reduce phase" / "Warning: User used REDUCE keyword, but did not specify DISTRIBUTE
> BY / CLUSTER BY column. Transformation will occur in the map phase."
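
The "acceptable behavior" described in the issue amounts to a small check comparing the keyword the user wrote with the phase the planner actually chose. The following is only a rough sketch of that check in Python, not Hive code: the function name `transform_warnings` and its inputs (`keyword`, `planned_phase`, `has_distribution`) are hypothetical, and only the two warning strings are taken from the issue text.

```python
from typing import List

# Warning texts quoted from the issue description.
MAP_WARNING = (
    "Warning: User used MAP keyword, but transformation will occur "
    "in the reduce phase"
)
REDUCE_WARNING = (
    "Warning: User used REDUCE keyword, but did not specify DISTRIBUTE "
    "BY / CLUSTER BY column. Transformation will occur in the map phase."
)


def transform_warnings(keyword: str, planned_phase: str,
                       has_distribution: bool) -> List[str]:
    """Return warnings when the planned phase contradicts the user's keyword.

    All three parameters are hypothetical inputs for this sketch:
      keyword          -- "MAP", "REDUCE", or "TRANSFORM" as written in the query
      planned_phase    -- "map" or "reduce", the phase the planner picked
      has_distribution -- True if the input to the script has a
                          DISTRIBUTE BY or CLUSTER BY clause
    """
    keyword = keyword.upper()
    warnings: List[str] = []
    if keyword == "MAP" and planned_phase == "reduce":
        # The user asked for a mapper, but the script is tacked on to a reducer.
        warnings.append(MAP_WARNING)
    elif keyword == "REDUCE" and planned_phase == "map" and not has_distribution:
        # The user asked for a reducer, but nothing forces a reduce step.
        warnings.append(REDUCE_WARNING)
    return warnings


if __name__ == "__main__":
    for message in transform_warnings("MAP", "reduce", False):
        print(message)
```

A plain SELECT TRANSFORM never produces a warning in this sketch, since it states no intent about the phase.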

This message is automatically generated by JIRA.