From: "ASF GitHub Bot (Jira)"
To: issues@hive.apache.org
Reply-To: dev@hive.apache.org
Date: Mon, 6 Jul 2020 15:55:00 +0000 (UTC)
Subject: [jira] [Work logged] (HIVE-23671) MSCK repair should handle transactional tables in certain usecases

     [ https://issues.apache.org/jira/browse/HIVE-23671?focusedWorklogId=454916&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-454916 ]

ASF GitHub Bot logged work on HIVE-23671:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Jul/20 15:54
            Start Date: 06/Jul/20 15:54
    Worklog Time Spent: 10m
      Work Description: pvargacl commented on a change in pull request #1087:
URL: https://github.com/apache/hive/pull/1087#discussion_r450318861

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##########
@@ -2392,33 +2392,29 @@ public static TableSnapshot getTableSnapshot(Configuration conf,
     long writeId = -1;
     ValidWriteIdList validWriteIdList = null;
 
-    HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
-    String fullTableName = getFullTableName(dbName, tblName);
-    if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
-      validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
-      if (isStatsUpdater) {
-        writeId = SessionState.get().getTxnMgr() != null ?
-                SessionState.get().getTxnMgr().getAllocatedTableWriteId(
-                  dbName, tblName) : -1;
-        if (writeId < 1) {
-          // TODO: this is not ideal... stats updater that doesn't have write ID is currently
-          //       "create table"; writeId would be 0/-1 here. No need to call this w/true.
-          LOG.debug("Stats updater for {}.{} doesn't have a write ID ({})",
-              dbName, tblName, writeId);
+    if (SessionState.get() != null) {
+      HiveTxnManager sessionTxnMgr = SessionState.get().getTxnMgr();
+      String fullTableName = getFullTableName(dbName, tblName);
+      if (sessionTxnMgr != null && sessionTxnMgr.getCurrentTxnId() > 0) {
+        validWriteIdList = getTableValidWriteIdList(conf, fullTableName);
+        if (isStatsUpdater) {
+          writeId = sessionTxnMgr != null ? sessionTxnMgr.getAllocatedTableWriteId(dbName, tblName) : -1;

Review comment:
fixed

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 454916)
    Time Spent: 7h 40m  (was: 7.5h)

> MSCK repair should handle transactional tables in certain usecases
> ------------------------------------------------------------------
>
>                 Key: HIVE-23671
>                 URL: https://issues.apache.org/jira/browse/HIVE-23671
>             Project: Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Peter Varga
>            Assignee: Peter Varga
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 7h 40m
>  Remaining Estimate: 0h
>
> The MSCK REPAIR tool does not handle transactional tables well. It can find and add new partitions the same way as for non-transactional tables, but since the writeId differences are not handled, the data cannot be read back from the new partitions.
> We could handle some use cases where the writeIds in the HMS and the underlying data do not conflict. If the HMS does not contain allocated writes for the table, we can seed the table with the writeIds read from the directory structure.
> Real-life use cases could be:
>  * Copy data files from one cluster to another with a different HMS, create the table and call MSCK REPAIR
>  * If the HMS db is lost, recreate the table and call MSCK REPAIR

--
This message was sent by Atlassian Jira
(v8.3.4#803005)