hive-issues mailing list archives

From "Ashutosh Bapat (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-21776) Replication fails to replicate a UDF with jar on HDFS during incremental
Date Fri, 24 May 2019 14:20:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-21776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Bapat updated HIVE-21776:
----------------------------------
    Description: When a UDF with a jar on HDFS is replicated, we add the jar path to the dump.
The dumped jar URL has the checksum and cmroot appended to it. During load, we load the jar
on the target. ReplCopyTask handles jar paths separately from the paths in _files, and it uses
the presence of the checksum and cmroot to make that decision (neither is present in _files
URLs). If ReplChangeManager is not initialized during dump, the dumped jar URL contains neither
the checksum nor the cmroot, so ReplCopyTask fails to copy the UDF jar to the target. This fails
the repl load because the function cannot be created. The fix is to always initialize
ReplChangeManager.  (was: TestReplicationScenariosAcrossInstances has test to test bootstrap of a UDF
with jar on HDFS but no test for incremental. Add the same.)

> Replication fails to replicate a UDF with jar on HDFS during incremental
> ------------------------------------------------------------------------
>
>                 Key: HIVE-21776
>                 URL: https://issues.apache.org/jira/browse/HIVE-21776
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 4.0.0
>            Reporter: Ashutosh Bapat
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-21776.01.patch, HIVE-21776.02.patch, HIVE-21776.03.patch, HIVE-21776.04.patch
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> When a UDF with a jar on HDFS is replicated, we add the jar path to the dump. The dumped
> jar URL has the checksum and cmroot appended to it. During load, we load the jar on the
> target. ReplCopyTask handles jar paths separately from the paths in _files, and it uses the
> presence of the checksum and cmroot to make that decision (neither is present in _files
> URLs). If ReplChangeManager is not initialized during dump, the dumped jar URL contains
> neither the checksum nor the cmroot, so ReplCopyTask fails to copy the UDF jar to the
> target. This fails the repl load because the function cannot be created. The fix is to
> always initialize ReplChangeManager.
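
The failure mode described above can be illustrated with a small, self-contained Java sketch. It is not Hive's actual ReplChangeManager or ReplCopyTask code: the class name JarUriSketch, both method names, the "#" separator, and the sample URLs are all assumptions made for illustration. It only shows why a jar URL dumped without the checksum and cmroot is not recognised as a jar path at load time.

```java
/** Hypothetical sketch; names and URL layout are assumptions, not Hive's real API. */
final class JarUriSketch {
    // Assumed separator between the jar path and the appended metadata.
    static final String SEP = "#";

    /** Dump side: checksum and cmroot are appended only if the change manager is initialized. */
    static String encodeForDump(String jarUri, String checksum, String cmRoot,
                                boolean changeManagerInitialized) {
        if (!changeManagerInitialized) {
            return jarUri; // the case described in the issue: plain URL, no metadata
        }
        return jarUri + SEP + checksum + SEP + cmRoot;
    }

    /** Load side: treat the URL as a function jar only if both extra parts are present. */
    static boolean looksLikeEncodedJar(String dumpedUri) {
        String[] parts = dumpedUri.split(SEP, -1);
        return parts.length == 3 && !parts[1].isEmpty() && !parts[2].isEmpty();
    }

    public static void main(String[] args) {
        String jar = "hdfs://nn:8020/udf/my.jar";      // sample value, not from the issue
        String cmRoot = "hdfs://nn:8020/cmroot";       // sample value, not from the issue

        String good = encodeForDump(jar, "abc123", cmRoot, true);
        String bad = encodeForDump(jar, "abc123", cmRoot, false);

        System.out.println(good + " -> jar path? " + looksLikeEncodedJar(good));
        System.out.println(bad + " -> jar path? " + looksLikeEncodedJar(bad));
    }
}
```

In this sketch, always initializing the change manager corresponds to always taking the encoding branch on the dump side, so the load side can tell jar URLs apart from _files entries.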



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
