hive-user mailing list archives

From Benjamin Bowman <bbowman...@gmail.com>
Subject Re: Possible memory leak with 0.13 and JDBC
Date Mon, 07 Jul 2014 13:14:16 GMT
I believe I am having the same issue, on Hive 0.13 and Hadoop 2.4.  We had to
increase the Hive heap to 4 GB, which lets Hive run for about 2-3 days.
After that it has consumed the entire heap and becomes unresponsive and/or
throws OOM exceptions.  We are using Beeline and HiveServer2 and connect to
the database via JDBC tens of thousands of times a day.

I have been working with a developer at Hortonworks to find a solution but
we have not come up with anything yet.  Have you made any progress on this
issue?

Thanks,
Benjamin


On Thu, Jul 3, 2014 at 4:17 PM, jonas.partner <jonas.partner@opencredo.com>
wrote:

> Hi Edward,
>
> Thanks for the response.  Sorry, I posted the wrong version. I also added
> close() calls on the two result sets to the code taken from the wiki, as
> below, but still see the same problem.
>
> Will try to run it through your kit at the weekend.  For the moment I
> switched to running the statements as a script through the hive client (not
> beeline) which seems stable even with hundreds of repetitions.
>
> Regards
>
> Jonas
>
>  public static void run() throws SQLException {
>             try {
>                 Class.forName(driverName);
>             } catch (ClassNotFoundException e) {
>                 // TODO Auto-generated catch block
>                 e.printStackTrace();
>                 System.exit(1);
>             }
>             // replace "hive" here with the name of the user the queries should run as
>             Connection con = DriverManager.getConnection(
>                     "jdbc:hive2://localhost:10000/default", "hive", "");
>             Statement stmt = con.createStatement();
>             String tableName = "testHiveDriverTable";
>             stmt.execute("drop table if exists " + tableName);
>             stmt.execute("create external table  " + tableName
>                     + " (key int, value string)");
>             // show tables
>             String sql = "show tables '" + tableName + "'";
>             System.out.println("Running: " + sql);
>             ResultSet res = stmt.executeQuery(sql);
>             if (res.next()) {
>                 System.out.println(res.getString(1));
>             }
>             res.close();
>             // describe table
>             sql = "describe " + tableName;
>             System.out.println("Running: " + sql);
>             res = stmt.executeQuery(sql);
>
>             while (res.next()) {
>                 System.out.println(res.getString(1) + "\t" + res.getString(2));
>             }
>             res.close();
>             stmt.close();
>             con.close();
>         }
>
>
>
> On 3 July 2014 at 21:05:25, Edward Capriolo (edlinuxguru@gmail.com) wrote:
>
>   Not saying there is not a leak elsewhere, but
> Statement and ResultSet objects both have .close()
> Java 7 now allows you to auto-close with try-with-resources:
> try (Connection conn = ...; Statement st = conn.createStatement()) {
>     // something
> }
>
>
> On Thu, Jul 3, 2014 at 6:35 AM, jonas.partner <jonas.partner@opencredo.com> wrote:
>
>>  We have been struggling to get a reliable system working where we
>> interact with Hive over JDBC a lot.  The pattern we see is that everything
>> starts ok but the memory used by the Hive server process grows over time
>> and after some hundreds of operations we start to see exceptions.
>>
>>  To ensure there was nothing stupid in our code causing this I took the
>> example code from the wiki page for Hive 2 clients and put that in a loop.
>>  For us after about 80 runs we would see exceptions as below.
>>
>> 2014-04-21 07:31:02,251 ERROR [pool-5-thread-5]: server.TThreadPoolServer (TThreadPoolServer.java:run(215)) - Error occurred during processing of message.
>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:744)
>> Caused by: org.apache.thrift.transport.TTransportException
>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>     at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
>>     at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>     ... 4 more
>>
>>  This is also sometimes accompanied by out of memory exceptions.
>>
>>
>>  The code on the wiki did not close statements. Adding that in changes
>> the behaviour: instead of exceptions, things just lock up after a while and
>> there is high CPU usage.
>>
>>  This looks similar to HIVE-5296
>> <https://issues.apache.org/jira/browse/HIVE-5296> but that was fixed in
>> 0.12 so should not be an issue in 0.13 I assume.  Issues fixed in 0.13.1
>> don’t seem to relate to this either.  The only way to get Hive back up and
>> running is to restart.
>>
>>  Before raising a JIRA I wanted to make sure I wasn’t missing something
>> so any suggestions would be greatly appreciated.
>>
>>  Full code as below.
>>
>>  import java.sql.*;
>>
>>
>> public class HiveOutOfMem {
>>
>>         private static String driverName =
>> "org.apache.hive.jdbc.HiveDriver";
>>
>>
>>         public static void main(String[] args) throws SQLException{
>>             for(int i =0; i < 100000; i++){
>>                 System.out.println("Run number " + i);
>>                 run();
>>             }
>>         }
>>
>>         /**
>>          * @param
>>          * @throws SQLException
>>          */
>>         public static void run() throws SQLException {
>>             try {
>>                 Class.forName(driverName);
>>             } catch (ClassNotFoundException e) {
>>                 // TODO Auto-generated catch block
>>                 e.printStackTrace();
>>                 System.exit(1);
>>             }
>>             // replace "hive" here with the name of the user the queries should run as
>>             Connection con = DriverManager.getConnection(
>>                     "jdbc:hive2://localhost:10000/default", "hive", "");
>>             Statement stmt = con.createStatement();
>>             String tableName = "testHiveDriverTable";
>>             stmt.execute("drop table if exists " + tableName);
>>             stmt.execute("create external table  " + tableName
>>                     + " (key int, value string)");
>>             // show tables
>>             String sql = "show tables '" + tableName + "'";
>>             System.out.println("Running: " + sql);
>>             ResultSet res = stmt.executeQuery(sql);
>>             if (res.next()) {
>>                 System.out.println(res.getString(1));
>>             }
>>
>>             // describe table
>>             sql = "describe " + tableName;
>>             System.out.println("Running: " + sql);
>>             res = stmt.executeQuery(sql);
>>             while (res.next()) {
>>                 System.out.println(res.getString(1) + "\t" + res.getString(2));
>>             }
>>             //stmt.close();
>>             con.close();
>>         }
>>
>> }
>>
>
>
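
For anyone reproducing this, here is a minimal sketch of the test loop rewritten with the Java 7 try-with-resources form mentioned in the thread, so that every ResultSet, Statement and Connection is closed even when a query throws. The class name HiveLoopWithAutoClose, the run(Connection, String) signature and the default of 100 runs are assumptions made for illustration; the driver class, JDBC URL, user and queries are taken from the code quoted above. It only rules out client-side leaks and is not a fix for the server-side memory growth reported here.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical class name; a sketch, not the wiki's or Hive's reference client.
class HiveLoopWithAutoClose {

    // Driver, URL and user copied from the test program quoted in this thread.
    private static final String DRIVER = "org.apache.hive.jdbc.HiveDriver";
    private static final String URL = "jdbc:hive2://localhost:10000/default";
    private static final String USER = "hive";

    public static void main(String[] args) throws SQLException, ClassNotFoundException {
        Class.forName(DRIVER);
        // Assumption: 100 runs by default; pass a number to override.
        int runs = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        for (int i = 0; i < runs; i++) {
            System.out.println("Run number " + i);
            // The connection is closed when this block exits, even on error.
            try (Connection con = DriverManager.getConnection(URL, USER, "")) {
                run(con, "testHiveDriverTable");
            }
        }
    }

    // Runs the same four statements as the original test against an open connection.
    static void run(Connection con, String tableName) throws SQLException {
        try (Statement stmt = con.createStatement()) {
            stmt.execute("drop table if exists " + tableName);
            stmt.execute("create external table " + tableName + " (key int, value string)");

            // Each ResultSet gets its own block so it is closed before the next query.
            try (ResultSet res = stmt.executeQuery("show tables '" + tableName + "'")) {
                if (res.next()) {
                    System.out.println(res.getString(1));
                }
            }
            try (ResultSet res = stmt.executeQuery("describe " + tableName)) {
                while (res.next()) {
                    System.out.println(res.getString(1) + "\t" + res.getString(2));
                }
            }
        }
    }
}
```

Resources are closed in reverse order of creation: each ResultSet first, then the Statement, then the Connection in main. If the server still runs out of heap with this client, the leak is unlikely to be caused by unclosed JDBC objects on the client side.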
