Mailing-List: contact dev-help@hive.apache.org; run by ezmlm
Reply-To: dev@hive.apache.org
Subject: Re: Review Request 57353: Intern Properties objects referenced from PartitionDesc to reduce memory pressure.
From: Misha Dmitriev
To: Sahil Takiar, Xuefu Zhang, Alan Gates, Rui Li, Chaozhong Yang, j.prasanth.j@gmail.com, Vihang Karajgaonkar, Sergio Pena
Cc: hive, Misha Dmitriev
Date: Fri, 21 Apr 2017 20:15:55 -0000
Message-ID: <20170421201555.23980.31091@reviews-vm2.apache.org>
References: <20170307012208.40991.75510@reviews-vm2.apache.org>
In-Reply-To: <20170307012208.40991.75510@reviews-vm2.apache.org>
X-ReviewBoard-URL: https://reviews.apache.org/
X-ReviewGroup: hive
X-ReviewRequest-URL: https://reviews.apache.org/r/57353/
X-ReviewRequest-Repository: hive-git
X-ReviewBoard-Diff-For: common/src/java/org/apache/hadoop/hive/common/CopyOnFirstWriteProperties.java
archived-at: Fri, 21 Apr 2017 20:16:00 -0000
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="===============0605793623481637585=="

--===============0605793623481637585==
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit


-----------------------------------------------------------
This is an automatically generated e-mail.
To reply, visit: https://reviews.apache.org/r/57353/#review172706
-----------------------------------------------------------


common/src/java/org/apache/hadoop/hive/common/CopyOnFirstWriteProperties.java
Lines 314 (patched)

    Oh, you are right. I didn't realize that
    com.google.common.collect.MapMaker, which this thing uses internally,
    always returns a ConcurrentMap, which is equivalent to
    ConcurrentHashMap in terms of thread safety. So there is no need for
    extra synchronization here. Fixed.


- Misha Dmitriev


On March 7, 2017, 1:22 a.m., Misha Dmitriev wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/57353/
> -----------------------------------------------------------
>
> (Updated March 7, 2017, 1:22 a.m.)
>
>
> Review request for hive, Chaozhong Yang, Alan Gates, Rui Li, Prasanth_J, Sergio Pena, Sahil Takiar, Vihang Karajgaonkar, and Xuefu Zhang.
>
>
> Bugs: HIVE-16079
>     https://issues.apache.org/jira/browse/HIVE-16079
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> When multiple concurrent Hive queries run, a separate copy of
> org.apache.hadoop.hive.ql.metadata.Partition and
> ql.plan.PartitionDesc is created for each table partition
> for each query instance. So when, in my benchmark explained in
> HIVE-16079, we have 2000 partitions and 50 concurrent queries running
> over them, we end up, in the worst case, with 2000*50=100,000 instances
> of Partition and PartitionDesc in memory. These objects themselves
> collectively take just ~2% of memory. However, the other data structures
> that each of them references take a lot more. In particular, Properties
> objects take more than 20% of memory. When we have 50 concurrent
> read-only queries, there are 50 identical copies of Properties for
> each partition. That's a huge waste of memory.
>
> This change introduces a new class that extends Properties, called
> CopyOnFirstWriteProperties. It uses a single interned copy of
> Properties whenever possible. However, when one of the methods that
> modify properties is called, a private copy is created first. When this
> class is used, memory consumption by Properties falls from 20% to 5-6%.
>
>
> Diffs
> -----
>
>   common/src/java/org/apache/hadoop/hive/common/CopyOnFirstWriteProperties.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 247d5890ea8131404b9543d22876ca4c052578e0
>   ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java d05c1c68fdb7296c0346d73967071da1ebe7bb72
>
>
> Diff: https://reviews.apache.org/r/57353/diff/1/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Misha Dmitriev
>
>

--===============0605793623481637585==--
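
Editorial illustration (not part of the original message): the copy-on-first-write idea in the quoted description can be sketched roughly as follows. This is not the code from the patch. The class name SharedProps, its methods, and the pool are hypothetical, and the sketch wraps a Properties object rather than extending it. It also swaps in a plain ConcurrentHashMap as the intern pool, on the assumption that any ConcurrentMap gives the same thread-safety guarantee the reply mentions.

```java
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical wrapper, for illustration only; the real patch extends Properties.
final class SharedProps {
    // Pool of canonical read-only snapshots, keyed by content.
    // A ConcurrentMap needs no extra synchronization around putIfAbsent.
    private static final ConcurrentMap<Properties, Properties> POOL =
            new ConcurrentHashMap<>();

    private Properties shared; // interned snapshot, never mutated via this wrapper
    private Properties own;    // private copy, created lazily on the first write

    SharedProps(Properties source) {
        // Take a snapshot so later changes to 'source' cannot alter the pool key.
        Properties snapshot = new Properties();
        snapshot.putAll(source);
        Properties previous = POOL.putIfAbsent(snapshot, snapshot);
        this.shared = (previous != null) ? previous : snapshot;
    }

    // Reads go to the private copy if one exists, otherwise to the shared one.
    String get(String key) {
        return (own != null ? own : shared).getProperty(key);
    }

    // The first write copies the shared content; later writes reuse the copy.
    // Note: an individual SharedProps instance is not itself thread-safe.
    void set(String key, String value) {
        if (own == null) {
            own = new Properties();
            own.putAll(shared);
            shared = null;
        }
        own.setProperty(key, value);
    }

    // True while both wrappers still point at the same interned snapshot.
    boolean sharesStorageWith(SharedProps other) {
        return shared != null && shared == other.shared;
    }
}

// Small driver showing the intended behaviour.
final class SharedPropsDemo {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("serialization.format", "1");
        SharedProps a = new SharedProps(p);
        SharedProps b = new SharedProps(p);
        System.out.println(a.sharesStorageWith(b)); // one snapshot serves both
        b.set("serialization.format", "2");         // b detaches; a is unaffected
        System.out.println(a.get("serialization.format"));
        System.out.println(b.get("serialization.format"));
    }
}
```

In this sketch, read-only users of identical properties hold a reference to one snapshot, which is where the memory saving in the description would come from; only a writer pays for its own copy.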