From: "ASF GitHub Bot (JIRA)"
To: issues@phoenix.apache.org
Reply-To: dev@phoenix.apache.org
Date: Mon, 3 Dec 2018 21:54:00 +0000 (UTC)
Subject: [jira] [Commented] (PHOENIX-5025) Tool to clean up orphan views

    [ https://issues.apache.org/jira/browse/PHOENIX-5025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16707862#comment-16707862 ]

ASF GitHub Bot commented on PHOENIX-5025:
-----------------------------------------

Github user kadirozde commented on a diff in the pull request:

https://github.com/apache/phoenix/pull/404#discussion_r238451983

--- Diff: phoenix-core/src/main/java/org/apache/phoenix/mapreduce/OrphanViewTool.java ---
@@ -0,0 +1,812 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.phoenix.mapreduce;
+
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.COLUMN_FAMILY;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.COLUMN_NAME;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.LINK_TYPE;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.SYSTEM_CATALOG_NAME;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.SYSTEM_CHILD_LINK_NAME;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.TABLE_NAME;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.TABLE_SCHEM;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.TABLE_TYPE;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.TENANT_ID;
+import static org.apache.phoenix.jdbc.PhoenixDatabaseMetaData.VIEW_TYPE;
+
+import java.io.BufferedReader;
+import java.io.BufferedWriter;
+import java.io.FileReader;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.sql.Connection;
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Iterator;
+import java.util.LinkedList;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.Properties;
+
+import org.apache.commons.cli.CommandLine;
+import org.apache.commons.cli.CommandLineParser;
+import org.apache.commons.cli.HelpFormatter;
+import org.apache.commons.cli.Option;
+import org.apache.commons.cli.Options;
+import org.apache.commons.cli.ParseException;
+import org.apache.commons.cli.PosixParser;
+import org.apache.commons.lang.exception.ExceptionUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.TableName;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+import org.apache.phoenix.jdbc.PhoenixConnection;
+import org.apache.phoenix.mapreduce.util.ConnectionUtil;
+import org.apache.phoenix.parse.DropTableStatement;
+import org.apache.phoenix.query.QueryServices;
+import org.apache.phoenix.query.QueryServicesOptions;
+import org.apache.phoenix.schema.MetaDataClient;
+import org.apache.phoenix.schema.PTable;
+import org.apache.phoenix.schema.PTableType;
+import org.apache.phoenix.schema.TableNotFoundException;
+import org.apache.phoenix.util.PhoenixRuntime;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * A tool to identify orphan views and links, and drop them
+ *
+ */
+public class OrphanViewTool extends Configured implements Tool {
+    private static final Logger LOG = LoggerFactory.getLogger(OrphanViewTool.class);
+    // Query all the views that are not "MAPPED" views
+    private static final String viewQuery = "SELECT " +
+            TENANT_ID + ", " +
+            TABLE_SCHEM + "," +
+            TABLE_NAME +
+            " FROM " + SYSTEM_CATALOG_NAME +
+            " WHERE "+ TABLE_TYPE + " = '" + PTableType.VIEW.getSerializedValue() +"' AND NOT " +
+            VIEW_TYPE + " = " + PTable.ViewType.MAPPED.getSerializedValue();
+    // Query all physical links
+    private static final String physicalLinkQuery = "SELECT " +
+            TENANT_ID + ", " +
+            TABLE_SCHEM + ", " +
+            TABLE_NAME + ", " +
+            COLUMN_NAME + " AS PHYSICAL_TABLE_TENANT_ID, " +
+            COLUMN_FAMILY + " AS PHYSICAL_TABLE_FULL_NAME " +
+            " FROM " + SYSTEM_CATALOG_NAME +
+            " WHERE "+ LINK_TYPE + " = " +
+            PTable.LinkType.PHYSICAL_TABLE.getSerializedValue();
+    // Query all child-parent links
+    private static final String childParentLinkQuery = "SELECT " +
+            TENANT_ID + ", " +
+            TABLE_SCHEM + ", " +
+            TABLE_NAME + ", " +
+            COLUMN_NAME + " AS PARENT_VIEW_TENANT_ID, " +
+            COLUMN_FAMILY + " AS PARENT_VIEW_FULL_NAME " +
+            " FROM " + SYSTEM_CATALOG_NAME +
+            " WHERE "+ LINK_TYPE + " = " +
+            PTable.LinkType.PARENT_TABLE.getSerializedValue();
+    // Query all parent-child links
+    private static final String parentChildLinkQuery = "SELECT " +
+            TENANT_ID + ", " +
+            TABLE_SCHEM + ", " +
+            TABLE_NAME + ", " +
+            COLUMN_NAME + " AS CHILD_VIEW_TENANT_ID, " +
+            COLUMN_FAMILY + " AS CHILD_VIEW_FULL_NAME " +
+            " FROM " + SYSTEM_CHILD_LINK_NAME +
+            " WHERE "+ LINK_TYPE + " = " +
+            PTable.LinkType.CHILD_TABLE.getSerializedValue();
+
+    // Query all the tables that can be a base table
+    private static final String candidateBaseTableQuery = "SELECT " +
+            TENANT_ID + ", " +
+            TABLE_SCHEM + ", " +
+            TABLE_NAME +
+            " FROM " + SYSTEM_CATALOG_NAME +
+            " WHERE "+ " NOT " + TABLE_TYPE + " = '" + PTableType.VIEW.getSerializedValue() + "'";
+
+    private String outputPath;
+    private String inputPath;
+    private boolean clean = false;
+    private int maxViewLevel = 0;
+    private static final long defaultAge = 24*60*60*1000; // 1 day
+    private long age = 0;
+
+    private static final byte VIEW = 0;
+    private static final byte PHYSICAL_TABLE_LINK = 1;
+    private static final byte PARENT_TABLE_LINK = 2;
+    private static final byte CHILD_TABLE_LINK = 3;
--- End diff --

I initially tried to use an enum and noticed that it would end up with somewhat unnecessarily involved code, as a Java enum is not really suitable for this case, where I need a C/C++-like enum.


> Tool to clean up orphan views
> -----------------------------
>
>                 Key: PHOENIX-5025
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5025
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Kadir OZDEMIR
>            Assignee: Kadir OZDEMIR
>            Priority: Major
>
> A view without its base table is an orphan view. Since views are virtual tables and their data is stored in their base tables, they are useless once they become orphans. A base table can have child views, grandchild views and so on. Due to some reasons/bugs, when a base table was dropped, its views were not properly cleaned up in the past. For example, the drop table code did not support cleaning up grandchild views. This has been recently fixed by PHOENIX-4764. Although PHOENIX-4764 prevents new orphan views due to table drop operations, it does not clean up existing orphan views. It is also believed that past splits of the system catalog table, caused by a bug, contributed to creating orphan views, because Phoenix did not support a splittable system catalog. Therefore, Phoenix needs a tool to clean up orphan views.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)