hive-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Igor Tatarinov <>
Subject Re: Does the "GROUP BY" need to be strict in Hive?
Date Wed, 19 Oct 2011 20:44:22 GMT
Yes, MySQL is not SQL-standard compliant here but I really like this
feature. A lot of times I know that a group-by column is a key for some
other columns I am selecting. It seems pointless to add all those other
columns to the group-by list.

Other times, I use this 'hole' in MySQL to pick a (random) value for a
non-group-by column. In fact, MySQL returns the first value in each
grouping, so
SELECT pid, seller, price
  SELECT pid, seller, price
  FROM offers
  ORDER by pid, price) t

returns the lowest seller/price for each product id. (This is not a
documented feature though but it seems to work. It's safer to do the same
with variables or a (slow) join.)


On Wed, Oct 19, 2011 at 1:06 PM, Mark Kerzner <>wrote:

> Hi,
> I think I've isolated my Hive GROUP BY problem to this question,
> In Hive, the GROUP BY needs to be strict. I mean
> hive> select property_id["property_id"], log_timestamp from trans group by
> property_id["property_id"];
> FAILED: Error in semantic analysis: Line 1:35 Expression not in GROUP BY
> key log_timestamp
> and that error message means it.
> However, a similar quiery in MySQL
> SELECT ext_booking_id, product_id FROM offer_sets group by ext_booking_id;
> works just fine.
> HIve seems to be on the right about this, and MySQL should not allow this.
> What do you all think?
> Thank you,
> Mark

View raw message