Date: Fri, 24 May 2019 14:20:00 +0000 (UTC)
From: "Ashutosh Bapat (JIRA)"
To: issues@hive.apache.org
Subject: [jira] [Updated] (HIVE-21776) Replication fails to replicate a UDF with jar on HDFS during incremental

[ https://issues.apache.org/jira/browse/HIVE-21776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-21776:
----------------------------------
Description: When a UDF with jar on HDFS is replicated, we add the jar path to the dump.
The dumped URL of the jar has a checksum and cmroot appended to it. During load, we load the jar on the target. ReplCopyTask handles jar paths separately from the paths in _files, and it uses the presence of the checksum and cmroot to make that decision (neither is present in a _files URL). If ReplChangeManager is not initialized during dump, the dumped URL of the jar does not contain the checksum and cmroot, so ReplCopyTask fails to copy the UDF jar to the target. This fails the repl load, since the function cannot be created. The fix is to always initialize ReplChangeManager.

(was: TestReplicationScenariosAcrossInstances has a test for bootstrap of a UDF with jar on HDFS but no test for incremental. Add the same.)

> Replication fails to replicate a UDF with jar on HDFS during incremental
> ------------------------------------------------------------------------
>
>                 Key: HIVE-21776
>                 URL: https://issues.apache.org/jira/browse/HIVE-21776
>             Project: Hive
>          Issue Type: Bug
>   Affects Versions: 4.0.0
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21776.01.patch, HIVE-21776.02.patch, HIVE-21776.03.patch, HIVE-21776.04.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
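
To make the failure mode in the description concrete, here is a minimal, self-contained sketch of the decision it describes. It is not Hive's code: the class JarUriSketch, both method names, the "#" separator, and the example paths are all assumptions made for illustration. It only shows how a jar URL dumped without checksum and cmroot would fail a presence check of the kind ReplCopyTask is said to rely on.

```java
// Illustrative sketch only; names and the "#" separator are hypothetical,
// not taken from Hive's ReplChangeManager or ReplCopyTask.
final class JarUriSketch {
    // Assumed separator between the path, the checksum, and the cmroot.
    private static final String SEP = "#";

    // Models the dump side: if the change manager was never initialized,
    // no checksum or cmroot is available and the bare path is dumped.
    static String encode(String jarPath, String checksum, String cmRoot) {
        if (checksum == null || cmRoot == null) {
            return jarPath;
        }
        return jarPath + SEP + checksum + SEP + cmRoot;
    }

    // Models the load side: a URL is treated as a jar path only when both
    // the checksum and the cmroot are present and non-empty.
    static boolean hasChecksumAndCmRoot(String dumpedUri) {
        String[] parts = dumpedUri.split(SEP, -1);
        return parts.length == 3 && !parts[1].isEmpty() && !parts[2].isEmpty();
    }

    public static void main(String[] args) {
        String jar = "hdfs://nn:8020/udfs/my.jar"; // hypothetical path
        String initialized = encode(jar, "abc123", "hdfs://nn:8020/cmroot");
        String uninitialized = encode(jar, null, null);

        System.out.println(initialized + " -> " + hasChecksumAndCmRoot(initialized));
        System.out.println(uninitialized + " -> " + hasChecksumAndCmRoot(uninitialized));
    }
}
```

Under these assumptions, the second URL is indistinguishable from an ordinary _files entry, which is why always initializing ReplChangeManager at dump time would avoid the problem.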