hive-user mailing list archives

From Navis류승우 <navis....@nexr.com>
Subject Re: Possible memory leak with 0.13 and JDBC
Date Tue, 08 Jul 2014 07:39:52 GMT
Could you run "jmap -histo:live <pid>" against the HiveServer2 process and
check which Hive objects have an unusually high instance count?

Thanks,
Navis
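
A histogram from a busy server can run to thousands of rows, so it may help to filter it down to Hive classes first. The following is only a sketch: the class name HiveHistoFilter and the two package prefixes it matches are assumptions, not something from this thread, and it assumes the usual HotSpot histogram layout of rank, instance count, byte count and class name.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: filters "jmap -histo" output down to Hive classes.
class HiveHistoFilter {

    /** Returns one summary string per histogram row whose class name is in a Hive package. */
    static List<String> hiveRows(List<String> histoLines) {
        List<String> out = new ArrayList<>();
        for (String line : histoLines) {
            String[] cols = line.trim().split("\\s+");
            // Data rows look like "<rank>: <instances> <bytes> <class name>";
            // header, separator and total rows do not start with "<rank>:".
            if (cols.length < 4 || !cols[0].endsWith(":")) {
                continue;
            }
            String cls = cols[3];
            // contains() rather than startsWith() so array types such as
            // "[Lorg.apache.hadoop.hive...;" are matched as well.
            // The package prefixes are an assumption; adjust as needed.
            if (cls.contains("org.apache.hadoop.hive") || cls.contains("org.apache.hive")) {
                out.add(cols[1] + " instances, " + cols[2] + " bytes: " + cls);
            }
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        // Read the whole histogram from standard input.
        List<String> lines = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
            String line;
            while ((line = in.readLine()) != null) {
                lines.add(line);
            }
        }
        for (String row : hiveRows(lines)) {
            System.out.println(row);
        }
    }
}
```

After compiling, pipe the jmap output into it. The rows keep the order jmap printed them in, so comparing two filtered histograms taken some hours apart should show which Hive classes keep growing.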


2014-07-07 22:22 GMT+09:00 jonas.partner <jonas.partner@opencredo.com>:

> Hi Benjamin,
> Unfortunately this was a really critical issue for us and I didn’t think
> we would find a fix in time, so we switched to generating Hive scripts
> programmatically and running them via an Oozie action which uses the Hive
> CLI.  This seems to be a stable solution, although it is a lot less
> convenient than JDBC for our use case.
>
> I hope to find some more time to look at this later in the week since JDBC
> would simplify the solution.  I would be very interested to hear if you
> make any progress.
>
> Regards
>
> Jonas
>
> On 7 July 2014 at 14:14:46, Benjamin Bowman (bbowman410@gmail.com) wrote:
>
> I believe I am having the same issue.  Hive 0.13 and Hadoop 2.4.  We had
> to increase the Hive heap to 4 GB, which allows Hive to function for about
> 2-3 days.  After that point it has consumed the entire heap and becomes
> unresponsive and/or throws OOM exceptions.  We are using Beeline and
> HiveServer2 and connect via JDBC to the database tens of thousands of
> times a day.
>
> I have been working with a developer at Hortonworks to find a solution but
> we have not come up with anything yet.  Have you made any progress on this
> issue?
>
> Thanks,
> Benjamin
>
>
> On Thu, Jul 3, 2014 at 4:17 PM, jonas.partner <jonas.partner@opencredo.com
> > wrote:
>
>>  Hi Edward,
>>
>>  Thanks for the response.  Sorry, I posted the wrong version. I also added
>> close() calls on the two result sets to the code taken from the wiki, as
>> below, but still see the same problem.
>>
>>  Will try to run it through your kit at the weekend.  For the moment I
>> switched to running the statements as a script through the hive client (not
>> beeline) which seems stable even with hundreds of repetitions.
>>
>>  Regards
>>
>>  Jonas
>>
>>   public static void run() throws SQLException {
>>             try {
>>                 Class.forName(driverName);
>>             } catch (ClassNotFoundException e) {
>>                 // TODO Auto-generated catch block
>>                 e.printStackTrace();
>>                 System.exit(1);
>>             }
>>             // replace "hive" here with the name of the user the queries should run as
>>             Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
>>             Statement stmt = con.createStatement();
>>             String tableName = "testHiveDriverTable";
>>             stmt.execute("drop table if exists " + tableName);
>>             stmt.execute("create external table  " + tableName + " (key int, value string)");
>>             // show tables
>>             String sql = "show tables '" + tableName + "'";
>>             System.out.println("Running: " + sql);
>>             ResultSet res = stmt.executeQuery(sql);
>>             if (res.next()) {
>>                 System.out.println(res.getString(1));
>>             }
>>             res.close();
>>             // describe table
>>             sql = "describe " + tableName;
>>             System.out.println("Running: " + sql);
>>             res = stmt.executeQuery(sql);
>>
>>             while (res.next()) {
>>                 System.out.println(res.getString(1) + "\t" + res.getString(2));
>>             }
>>             res.close();
>>             stmt.close();
>>             con.close();
>>         }
>>
>>
>>
>> On 3 July 2014 at 21:05:25, Edward Capriolo (edlinuxguru@gmail.com) wrote:
>>
>>    Not saying there is not a leak elsewhere, but
>> Statement and ResultSet objects both have .close()
>>
>> Java 7 now allows you to auto-close them:
>> try (  Connection conn ...; Statement st = conn.createStatement() ){
>> something
>> }
>>
>>
>> On Thu, Jul 3, 2014 at 6:35 AM, jonas.partner <
>> jonas.partner@opencredo.com> wrote:
>>
>>>  We have been struggling to get a reliable system working where we
>>> interact with Hive over JDBC a lot.  The pattern we see is that everything
>>> starts ok but the memory used by the Hive server process grows over time
>>> and after some hundreds of operations we start to see exceptions.
>>>
>>>  To ensure there was nothing stupid in our code causing this I took the
>>> example code from the wiki page for Hive 2 clients and put that in a loop.
>>>  For us after about 80 runs we would see exceptions as below.
>>>
>>> 2014-04-21 07:31:02,251 ERROR [pool-5-thread-5]: server.TThreadPoolServer (TThreadPoolServer.java:run(215)) - Error occurred during processing of message.
>>> java.lang.RuntimeException: org.apache.thrift.transport.TTransportException
>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
>>>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:189)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>>     at java.lang.Thread.run(Thread.java:744)
>>> Caused by: org.apache.thrift.transport.TTransportException
>>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>     at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:178)
>>>     at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
>>>     at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:253)
>>>     at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
>>>     at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
>>>     ... 4 more
>>>
>>>  This is also sometimes accompanied by out of memory exceptions.
>>>
>>>
>>>  The code on the wiki did not close statements. Adding that in changes
>>> the behaviour: instead of exceptions, things just lock up after a while
>>> and there is high CPU usage.
>>>
>>>  This looks similar to HIVE-5296
>>> <https://issues.apache.org/jira/browse/HIVE-5296> but that was fixed in
>>> 0.12 so should not be an issue in 0.13 I assume.  Issues fixed in 0.13.1
>>> don’t seem to relate to this either.  The only way to get Hive back up and
>>> running is to restart.
>>>
>>>  Before raising a JIRA I wanted to make sure I wasn’t missing something
>>> so any suggestions would be greatly appreciated.
>>>
>>>  Full code as below.
>>>
>>>  import java.sql.*;
>>>
>>>
>>> public class HiveOutOfMem {
>>>
>>>         private static String driverName =
>>> "org.apache.hive.jdbc.HiveDriver";
>>>
>>>
>>>         public static void main(String[] args) throws SQLException{
>>>             for(int i =0; i < 100000; i++){
>>>                 System.out.println("Run number " + i);
>>>                 run();
>>>             }
>>>         }
>>>
>>>         /**
>>>          * @param
>>>          * @throws SQLException
>>>          */
>>>         public static void run() throws SQLException {
>>>             try {
>>>                 Class.forName(driverName);
>>>             } catch (ClassNotFoundException e) {
>>>                 // TODO Auto-generated catch block
>>>                 e.printStackTrace();
>>>                 System.exit(1);
>>>             }
>>>             // replace "hive" here with the name of the user the queries should run as
>>>             Connection con = DriverManager.getConnection("jdbc:hive2://localhost:10000/default", "hive", "");
>>>             Statement stmt = con.createStatement();
>>>             String tableName = "testHiveDriverTable";
>>>             stmt.execute("drop table if exists " + tableName);
>>>             stmt.execute("create external table  " + tableName + " (key int, value string)");
>>>             // show tables
>>>             String sql = "show tables '" + tableName + "'";
>>>             System.out.println("Running: " + sql);
>>>             ResultSet res = stmt.executeQuery(sql);
>>>             if (res.next()) {
>>>                 System.out.println(res.getString(1));
>>>             }
>>>
>>>             // describe table
>>>             sql = "describe " + tableName;
>>>             System.out.println("Running: " + sql);
>>>             res = stmt.executeQuery(sql);
>>>             while (res.next()) {
>>>                 System.out.println(res.getString(1) + "\t" +
>>> res.getString(2));
>>>             }
>>>             //stmt.close();
>>>             con.close();
>>>         }
>>>
>>> }
>>>
>>
>>
>
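
For reference, Edward's try-with-resources suggestion applied to the run() method from the thread might look roughly like the sketch below. The class name HiveJdbcAutoClose and the printRows helper are made up for illustration; the driver class, URL, user and SQL are taken from the posted code. It only tidies up client-side handling so that result sets, the statement and the connection are closed even when a query throws. Jonas reports the problem persisted after adding explicit close() calls, so this should not be read as a fix for the server-side memory growth.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical rewrite of the repro's run() using Java 7 try-with-resources.
class HiveJdbcAutoClose {

    private static final String DRIVER = "org.apache.hive.jdbc.HiveDriver";
    private static final String URL = "jdbc:hive2://localhost:10000/default";

    /**
     * Runs a query, prints the first {@code columns} columns of each row
     * separated by tabs, and returns the number of rows seen.
     * The ResultSet is closed when the try block exits, even on error.
     */
    static int printRows(Statement stmt, String sql, int columns) throws SQLException {
        System.out.println("Running: " + sql);
        int rows = 0;
        try (ResultSet res = stmt.executeQuery(sql)) {
            while (res.next()) {
                StringBuilder row = new StringBuilder();
                for (int c = 1; c <= columns; c++) {
                    if (c > 1) {
                        row.append('\t');
                    }
                    row.append(res.getString(c));
                }
                System.out.println(row);
                rows++;
            }
        }
        return rows;
    }

    public static void run() throws SQLException {
        try {
            Class.forName(DRIVER);
        } catch (ClassNotFoundException e) {
            throw new SQLException("Hive JDBC driver not found on the classpath", e);
        }
        String tableName = "testHiveDriverTable";
        // Resources are closed in reverse order: the Statement first, then the Connection.
        // Replace "hive" with the name of the user the queries should run as.
        try (Connection con = DriverManager.getConnection(URL, "hive", "");
             Statement stmt = con.createStatement()) {
            stmt.execute("drop table if exists " + tableName);
            stmt.execute("create external table " + tableName + " (key int, value string)");
            printRows(stmt, "show tables '" + tableName + "'", 1);
            printRows(stmt, "describe " + tableName, 2);
        }
    }
}
```

The loop in the original main() can call HiveJdbcAutoClose.run() unchanged. One small difference from the posted code: the "show tables" step prints every matching row rather than only the first.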
