hive-user mailing list archives

From Gopal Vijayaraghavan <gop...@apache.org>
Subject Re: limit clause + fetch optimization
Date Wed, 22 Jul 2015 06:14:26 GMT

> Just want to make sure I understand the behavior once that bug is
> fixed... a 'select *' with no limit will run without a M/R job and instead
> stream.  Is that correct?

Yes, that's the intended behaviour. I can help you get a fix in, if you
have some time to test out my WIP patches.

> That may incidentally solve another bug I'm seeing: when you use JDBC
> templates to set the limit (setMaxRows in Spring in my setup), it does
> not avoid the M/R job (and no limit clause appears in the hive-server2
> log).  Instead, the M/R job gets launched... I'm
> not sure if the jdbc framework subsequently would apply a limit, once
> the job finishes.  I haven't spotted this issue in JIRA, I'd be happy to
> file it if that's useful to you.

Please file a JIRA; that would be very useful for me.

There's a lot of low-hanging fruit in the JDBC + Prepared Statement
codepath, so going over the issues & filing your findings would help me
pick up and knock them off one by one when I'm back.
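
Until that is sorted out, one possible workaround is to put the limit into
the SQL text itself instead of relying on setMaxRows alone. The sketch below
is only an illustration under assumptions: ExplicitLimit, withLimit and
countRows are hypothetical names, the Connection is assumed to come from
whatever HiveServer2 JDBC setup you already have, and whether the explicit
LIMIT actually skips the M/R job still depends on the fetch optimization
discussed in this thread.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Locale;

class ExplicitLimit {

    // Hypothetical helper: append "LIMIT n" unless the query already ends
    // with a LIMIT clause. A trailing semicolon is dropped first.
    static String withLimit(String sql, int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        String trimmed = sql.trim();
        if (trimmed.endsWith(";")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        boolean hasLimit = trimmed.toUpperCase(Locale.ROOT)
                .matches("(?s).*\\bLIMIT\\s+\\d+$");
        return hasLimit ? trimmed : trimmed + " LIMIT " + maxRows;
    }

    // Runs the query with the limit visible to the server in the SQL text.
    // setMaxRows is kept as a client-side cap on the rows returned.
    static int countRows(Connection conn, String sql, int maxRows)
            throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.setMaxRows(maxRows);
            try (ResultSet rs = stmt.executeQuery(withLimit(sql, maxRows))) {
                int n = 0;
                while (rs.next()) {
                    n++;
                }
                return n;
            }
        }
    }

    public static void main(String[] args) {
        // Only the string rewriting is exercised here; no server is needed.
        System.out.println(withLimit("SELECT * FROM src;", 100));
    }
}
```

With this, the limit clause should at least show up in the hive-server2 log,
which makes it easier to tell a server-side limit from one applied by the
client after the job finishes.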

Prasanth's github has some automated benchmarking tools for JDBC, which I
use heavily - https://github.com/prasanthj/jmeter-hiveserver2/tree/llap


There are some known issues that cause a 2-3x perf degradation for the
simple query patterns you're running, like -
https://issues.apache.org/jira/browse/HIVE-10982

Cheers,
Gopal


