hive-user mailing list archives

From Prem Yadav <>
Subject Re: Hive query over JDBC not honoring fetch size
Date Wed, 19 Aug 2015 14:53:43 GMT
actually it should be something like

On Wed, Aug 19, 2015 at 3:49 PM, Prem Yadav <> wrote:

> Hi Emil,
> For either of the queries, there will be no MapReduce job. The query
> engine understands that in both cases it need not do any computation and
> just needs to fetch all the data from the files.
> The fetch size should be honored in both cases. I hope you are using
> HiveServer2.
> You can try connecting with Excel and Cloudera's ODBC driver, with the
> required parameters, for your testing. For each batch that Hive returns, you
> should be able to see something like this in the Hive log: returning
> results for id <hash>
> On Wed, Aug 19, 2015 at 2:54 PM, Emil Berglind <>
> wrote:
>> I have a small Java app that I wrote that uses JDBC to run a hive query.
>> The Hive table that I'm running it against has 30+ million rows, and I want
>> to pull them all back to verify the data. If I run a simple "SELECT * FROM
>> <table>" and set a fetch size of 30,000 then the fetch size is not honored
>> and it seems to want to bring back all 30+ million rows at once, which is
>> definitely not going to work. If I set a LIMIT on the SQL, like "SELECT *
>> FROM <table> LIMIT 9999999", then it honors the fetch size just fine.
>> However, when I set the LIMIT on there, it does not run as a MapReduce job
>> but rather seems to stream the data back. Is this how it's supposed to
>> work? I'm new to the Hadoop ecosystem and I'm really just trying to figure
>> out what the best way to bring this data back in chunks is. Maybe I'm going
>> about this all wrong?
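
For readers following the thread, here is a minimal sketch of the pattern being discussed: set a fetch size on a JDBC Statement, then iterate the ResultSet row by row. It uses only the standard java.sql API. The class and method names (FetchSizeSketch, streamFirstColumn) are hypothetical and not from the thread, and setFetchSize is only a hint, so whether a given Hive driver version honors it is exactly the open question above.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.function.Consumer;

// Hypothetical helper, not part of Hive or of the original poster's app.
final class FetchSizeSketch {

    /**
     * Runs the query, passes the first column of each row to onRow,
     * and returns the number of rows seen.
     * fetchSize is a hint to the driver about how many rows to pull per round trip.
     */
    static long streamFirstColumn(Connection conn, String sql, int fetchSize,
                                  Consumer<String> onRow) throws SQLException {
        long rows = 0;
        try (Statement stmt = conn.createStatement()) {
            // Set the hint before executing so it can apply to this result set.
            stmt.setFetchSize(fetchSize);
            try (ResultSet rs = stmt.executeQuery(sql)) {
                // Consume rows one at a time instead of collecting them all in memory.
                while (rs.next()) {
                    onRow.accept(rs.getString(1));
                    rows++;
                }
            }
        }
        return rows;
    }
}
```

Assuming a HiveServer2 endpoint and the Hive JDBC driver on the classpath, the connection would typically come from DriverManager.getConnection with a URL of the form jdbc:hive2://<host>:<port>/<database>, and the call would look like streamFirstColumn(conn, "SELECT * FROM <table>", 30000, System.out::println). Processing each row as it arrives keeps client memory bounded, provided the driver really does fetch in batches.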
