hive-user mailing list archives

From Adam Silberstein <>
Subject Re: limit clause + fetch optimization
Date Wed, 22 Jul 2015 16:38:26 GMT
Thanks Gopal.  I filed an issue to cover JDBC+setMaxRows:

For your first offer of testing a patch, unfortunately we tend to run our
production software on customers' Hadoop clusters, so we can't easily patch
their Hive instances.  But I'll still take you up on that if I find some
time to try it.


On Tue, Jul 21, 2015 at 11:14 PM, Gopal Vijayaraghavan <> wrote:

> > Just want to make sure I understand the behavior once that bug is
> >fixed...a 'select *' with no limit will run without a M/R job and instead
> >stream.  Is that correct?
> Yes, that's the intended behaviour. I can help you get a fix in, if you
> have some time to test out my WIP patches.
> > That may incidentally solve another bug I'm seeing: when you use JDBC
> >templates to set the limit (setMaxRows in Spring in my setup), it does
> >not avoid the M/R job (and no limit clause appears in the hive-server2
> >log).  Instead, the M/R job gets launched...I'm
> > not sure if the jdbc framework subsequently would apply a limit, once
> >the job finishes.  I haven't spotted this issue in JIRA, I'd be happy to
> >file it if that's useful to you.
> File a JIRA, would be very useful for me.
> There's a lot of low-hanging fruit in the JDBC + Prepared Statement
> codepath, so going over the issues & filing your findings would help me
> pick up and knock them off one by one when I'm back.
> Prasanth's github has some automated benchmarking tools for JDBC, which I
> use heavily -
> There are some known issues which have a 2-3x perf degradation for the
> simple query patterns you're running, like -
> Cheers,
> Gopal
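
For readers hitting the setMaxRows behaviour described above, one possible client-side workaround is to put the limit into the SQL text itself, so that hive-server2 actually sees a LIMIT clause instead of relying on the JDBC row cap. The sketch below is only an illustration under assumptions: the class name LimitClause, the helper withLimit, and the table name events are hypothetical and not part of Hive, JDBC, or Spring, and the check for an existing limit is a simple pattern match, not a SQL parser.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not a Hive or Spring API): appends an explicit LIMIT
// to a query string so the limit is visible to the server in the SQL text.
public final class LimitClause {
    // Matches a query that already ends in "LIMIT <n>", optionally followed by ';'.
    // This is a heuristic; it does not parse SQL.
    private static final Pattern TRAILING_LIMIT =
            Pattern.compile("(?is).*\\blimit\\s+\\d+\\s*;?\\s*$");

    private LimitClause() {}

    public static String withLimit(String sql, int maxRows) {
        if (maxRows <= 0) {
            throw new IllegalArgumentException("maxRows must be positive");
        }
        String trimmed = sql.trim();
        if (TRAILING_LIMIT.matcher(trimmed).matches()) {
            return trimmed; // leave an existing trailing LIMIT untouched
        }
        if (trimmed.endsWith(";")) {
            // drop a trailing semicolon before appending the clause
            trimmed = trimmed.substring(0, trimmed.length() - 1).trim();
        }
        return trimmed + " LIMIT " + maxRows;
    }

    public static void main(String[] args) {
        // "events" is a placeholder table name.
        System.out.println(withLimit("SELECT * FROM events", 100));
    }
}
```

The resulting string can be passed to a normal java.sql.Statement or a Spring JdbcTemplate query. Whether this avoids the M/R job depends on the Hive version and its fetch-task settings, as discussed in the thread.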
