hive-user mailing list archives

From Adam Silberstein <a...@trifacta.com>
Subject limit clause + fetch optimization
Date Wed, 22 Jul 2015 01:36:08 GMT
Hi,
I've been experimenting with 'select *' and 'select * limit X' in beeline
and watching the hive-server2 log to understand when an M/R job is triggered
and when it is not.  It seems that whenever I set a limit, the job is avoided,
but with no limit, it is run.

I found this param:
hive.limit.optimize.fetch.max

It defaults to 50,000 and, as I understand it, whenever I set the limit
above that number, a job should be triggered.  But I can set the limit to
something very high (e.g. 10M) and no job runs.
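
For reference, the comparison looks roughly like the sketch below. The table
name my_table is a placeholder, not the actual table, and EXPLAIN is only one
assumed way to inspect the plan instead of reading the hive-server2 log.

```sql
-- Print the current value of the parameter (expected to be 50,000 by default).
SET hive.limit.optimize.fetch.max;

-- No limit: an M/R job is launched, as observed in the log.
SELECT * FROM my_table;

-- Limit far above the parameter value: no M/R job is launched.
SELECT * FROM my_table LIMIT 10000000;

-- Show the plan for the limited query without running it.
EXPLAIN SELECT * FROM my_table LIMIT 10000000;
```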

If anyone has insight into how this param is used or into the expected
behavior of the fetch optimization, I would appreciate it.

This is on Hive 1.1 inside CDH5.4.

Thanks,
Adam
