From dev-return-48794-archive-asf-public=cust-asf.ponee.io@phoenix.apache.org Thu Jan 25 20:55:09 2018 Return-Path: X-Original-To: archive-asf-public@eu.ponee.io Delivered-To: archive-asf-public@eu.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by mx-eu-01.ponee.io (Postfix) with ESMTP id AC99B180651 for ; Thu, 25 Jan 2018 20:55:09 +0100 (CET) Received: by cust-asf.ponee.io (Postfix) id 9CD88160C3D; Thu, 25 Jan 2018 19:55:09 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id E8BAF160C50 for ; Thu, 25 Jan 2018 20:55:08 +0100 (CET) Received: (qmail 72293 invoked by uid 500); 25 Jan 2018 19:55:08 -0000 Mailing-List: contact dev-help@phoenix.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@phoenix.apache.org Delivered-To: mailing list dev@phoenix.apache.org Received: (qmail 72280 invoked by uid 99); 25 Jan 2018 19:55:08 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 25 Jan 2018 19:55:08 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 7A55F1808DE for ; Thu, 25 Jan 2018 19:55:07 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -108.711 X-Spam-Level: X-Spam-Status: No, score=-108.711 tagged_above=-999 required=6.31 tests=[ENV_AND_HDR_SPF_MATCH=-0.5, RCVD_IN_DNSWL_LOW=-0.7, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01, USER_IN_DEF_SPF_WL=-7.5, USER_IN_WHITELIST=-100] autolearn=disabled Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id dNL3KhcgbPWt for ; Thu, 25 Jan 2018 19:55:06 +0000 (UTC) Received: from mailrelay1-us-west.apache.org (mailrelay1-us-west.apache.org [209.188.14.139]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTP id 2B2FD60CDD for ; Thu, 25 Jan 2018 19:55:05 +0000 (UTC) Received: from jira-lw-us.apache.org (unknown [207.244.88.139]) by mailrelay1-us-west.apache.org (ASF Mail Server at mailrelay1-us-west.apache.org) with ESMTP id 32DF8E2624 for ; Thu, 25 Jan 2018 19:55:04 +0000 (UTC) Received: from jira-lw-us.apache.org (localhost [127.0.0.1]) by jira-lw-us.apache.org (ASF Mail Server at jira-lw-us.apache.org) with ESMTP id 0E2FA24785 for ; Thu, 25 Jan 2018 19:55:03 +0000 (UTC) Date: Thu, 25 Jan 2018 19:55:03 +0000 (UTC) From: "James Taylor (JIRA)" To: dev@phoenix.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Updated] (PHOENIX-4550) Ensure storage is dense when base table has many views MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/PHOENIX-4550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] James Taylor updated PHOENIX-4550: ---------------------------------- Summary: Ensure storage is dense when base table has many views (was: Allow declaration of max columns on base physical table) > Ensure storage is dense when base table has many views > ------------------------------------------------------ > > Key: PHOENIX-4550 > URL: https://issues.apache.org/jira/browse/PHOENIX-4550 > Project: Phoenix > Issue Type: Improvement > Reporter: James Taylor > Priority: Major > > By declaring the max number of columns on a base table, we can optimize the storage for SINGLE_CELL_ARRAY_WITH_OFFSETS by not storing null values for the columns preceding the initial column of a view. This will make a huge difference in storage when you have a base table with many views. For example: > {code} > -- Declare that the base table will have no more than 10 columns > CREATE IMMUTABLE TABLE base (k1 VARCHAR, prefix CHAR(3) v1 DATE, > CONSTRAINT pk PRIMARY KEY (k1, prefix)) > MULTI_TENANT = true, > MAX_COLUMNS = 10; > CREATE VIEW v1(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR) > AS SELECT * FROM base WHERE prefix = 'A00'; > CREATE VIEW v2(k2 VARCHAR PRIMARY KEY, v2 VARCHAR, v3 VARCHAR); > AS SELECT * FROM base WHERE prefix = 'A10'; > ... > {code} > As the number of views grow, the difference between the base table column encoding (column #1) and the starting column number of the view (since the starting offset is determined by an incrementing value on the base table) will increase. This bloats the storage as we need to store null values for column encodings between the base table column and the starting column of the view. > Instead, we'll pass through the MAX_COLUMNS value for queries and anything column encoding less than this we know it'll be at the start. Anything greater and we'll start the search from - . -- This message was sent by Atlassian JIRA (v7.6.3#76005)