From: "Chaoyu Tang (JIRA)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Date: Fri, 22 Apr 2016 16:47:12 +0000 (UTC)
Subject: [jira] [Commented] (HIVE-13527) Using deprecated APIs in HBase client causes zookeeper connection leaks.

    [ https://issues.apache.org/jira/browse/HIVE-13527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254199#comment-15254199 ]

Chaoyu Tang commented on HIVE-13527:
------------------------------------

+1

> Using deprecated APIs in HBase client causes zookeeper connection leaks.
> ------------------------------------------------------------------------
>
>                 Key: HIVE-13527
>                 URL: https://issues.apache.org/jira/browse/HIVE-13527
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 1.1.0
>            Reporter: Naveen Gangam
>            Assignee: Naveen Gangam
>         Attachments: HIVE-13527.2.patch, HIVE-13527.2.patch, HIVE-13527.patch, HIVE-13527.patch
>
>
> When running queries against HBase-backed Hive tables, the following log messages are seen in the HS2 log.
> {code}
> 2016-04-11 07:25:23,657 WARN org.apache.hadoop.hbase.mapreduce.TableInputFormatBase: You are using an HTable instance that relies on an HBase-managed Connection. This is usually due to directly creating an HTable, which is deprecated. Instead, you should create a Connection object and then request a Table instance from it. If you don't need the Table instance for your own use, you should instead use the TableInputFormatBase.initalizeTable method directly.
> 2016-04-11 07:25:23,658 INFO org.apache.hadoop.hbase.mapreduce.TableInputFormatBase: Creating an additional unmanaged connection because user provided one can't be used for administrative actions. We'll close it when we close out the table.
> {code}
> In one HS2 log file, 1366 ZooKeeper connections were established but only 54 of them were closed, so lsof would show 1300+ open TCP connections to ZooKeeper.
> grep "org.apache.zookeeper.ClientCnxn: Session establishment complete on server" * |wc -l
> 1366
> grep "INFO org.apache.zookeeper.ZooKeeper: Session:" * |grep closed |wc -l
> 54
> According to the comments in TableInputFormatBase, subclasses like HiveHBaseTableInputFormat should call initializeTable() instead of setHTable(), which HiveHBaseTableInputFormat currently uses.
> "
> Subclasses MUST ensure initializeTable(Connection, TableName) is called for an instance to function properly.
> Each of the entry points to this class used by the MapReduce framework, {@link #createRecordReader(InputSplit, TaskAttemptContext)} and {@link #getSplits(JobContext)}, will call {@link #initialize(JobContext)} as a convenient centralized location to handle retrieving the necessary configuration information. If your subclass overrides either of these methods, either call the parent version or call initialize yourself.
> "
> Currently setHTable() also creates an additional Admin connection, even though it is not needed.
> So the deprecated API calls should be replaced.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)