hive-dev mailing list archives

From Daniel Dai <dai...@gmail.com>
Subject Re: Review Request 62360: HIVE-16898: Validation of source file after distcp in repl load
Date Mon, 18 Sep 2017 22:55:15 GMT


> On Sept. 18, 2017, 4:49 a.m., anishek wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
> > Lines 73 (patched)
> > <https://reviews.apache.org/r/62360/diff/1/?file=1828081#file1828081line73>
> >
> >     The choice between regularCopy and distCp can be evaluated in the innermost function
> >     call; this avoids passing in another variable from the top when it can be evaluated later.

I need to cache useRegularCopy and pass it to multiple doCopyRetry calls; that's why I put it
in the outer function.
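
To illustrate the point about caching, here is a minimal sketch, not the code from the patch.
The class name, the size threshold, and the helper shouldUseRegularCopy are hypothetical; only
the names useRegularCopy and doCopyRetry come from this thread. The decision is evaluated once
in the outer function and the cached value is handed to every doCopyRetry call.

```java
import java.util.ArrayList;
import java.util.List;

public class CopyDecisionSketch {
    // Hypothetical threshold; the real decision in CopyUtils may depend on other settings.
    static final long MAX_REGULAR_COPY_BYTES = 32L * 1024 * 1024;

    // Counts how often the decision is evaluated, for demonstration only.
    static int decisionsEvaluated = 0;

    // Hypothetical helper: decide once, based on the total size of all files.
    static boolean shouldUseRegularCopy(List<Long> fileSizes) {
        decisionsEvaluated++;
        long total = 0;
        for (long size : fileSizes) {
            total += size;
        }
        return total <= MAX_REGULAR_COPY_BYTES;
    }

    // Stand-in for doCopyRetry: it receives the cached decision instead of recomputing it.
    static String doCopyRetry(long fileSize, boolean useRegularCopy) {
        return useRegularCopy ? "regular" : "distcp";
    }

    // Outer function: evaluate the decision once, then reuse it for every call.
    static List<String> copyAll(List<Long> fileSizes) {
        boolean useRegularCopy = shouldUseRegularCopy(fileSizes);
        List<String> modes = new ArrayList<>();
        for (long size : fileSizes) {
            modes.add(doCopyRetry(size, useRegularCopy));
        }
        return modes;
    }
}
```

Evaluating the decision in the innermost call would repeat it for every file; caching it in the
outer function keeps all doCopyRetry calls on the same copy mode.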


> On Sept. 18, 2017, 4:49 a.m., anishek wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
> > Lines 92 (patched)
> > <https://reviews.apache.org/r/62360/diff/1/?file=1828081#file1828081line92>
> >
> >     I think eventually we have to move to a model of comparing the checksum on sourceFS
> >     against destinationFS, as you have done here. However, certain FS configurations change
> >     the checksum value, and unless we can guarantee that the checksum is calculated by
> >     reading the data itself, this might lead to more failures.
> >     
> >     I thought the idea for now was that:
> >     
> >     1>> we get the checksum of the file on sourceFS before the copy
> >     2>> we do the copy
> >     3>> we get the checksum of the file on sourceFS again
> >     4>> we compare the checksums from 1 and 3; if the value has not changed, then the
> >         file did not change during our copy either.
> >     
> >     Until we can figure out the actual solution to this, falling back to doing the check
> >     on sourceFS might be the way to go.

Yes, that's right. The checksum of the file is in _files.
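
The four steps quoted above can be sketched as follows. This is an illustration under stated
assumptions, not the patch itself: the file systems are modeled as in-memory maps, CRC32 stands
in for whatever checksum the real file system reports, and the recordedChecksum parameter stands
in for the value stored in _files. The duringCopy hook exists only so that a concurrent change
to the source can be simulated.

```java
import java.util.Map;
import java.util.zip.CRC32;

public class SourceChecksumSketch {
    // CRC32 is an assumption for illustration; it is not the checksum Hadoop file systems report.
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data);
        return crc.getValue();
    }

    // Returns true if the source file still matches the checksum recorded before the copy.
    static boolean copyAndVerify(Map<String, byte[]> sourceFs,
                                 Map<String, byte[]> destFs,
                                 String path,
                                 long recordedChecksum, // step 1: taken before the copy
                                 Runnable duringCopy) {
        // Step 2: do the copy.
        destFs.put(path, sourceFs.get(path).clone());
        // Hook that lets a caller simulate a writer touching the source mid-copy.
        duringCopy.run();
        // Step 3: take the checksum on the source again.
        long after = checksum(sourceFs.get(path));
        // Step 4: an unchanged checksum implies the source did not change during the copy.
        return recordedChecksum == after;
    }
}
```

Both checksums are taken on the source, so the comparison does not depend on the destination
file system producing the same checksum value for identical data.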


> On Sept. 18, 2017, 4:49 a.m., anishek wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java
> > Lines 116 (patched)
> > <https://reviews.apache.org/r/62360/diff/1/?file=1828081#file1828081line116>
> >
> >     If the copy fails with a FileNotFoundException for a file whose location points to the
> >     actual location on HDFS, then we should retry with the corresponding CMRoot path for
> >     this file, since it was moved while we were in the process of doing the copy.
> >     
> >     Also, if this happens for a CM root file, then there is an issue in our configuration:
> >     the CM root FS is being cleaned before the copy is done. We should log this as an error,
> >     as the cleaner thread for CMRoot is not configured for the right time. I'd rather fail
> >     repl load instead of just logging the error; otherwise we might not know how many such
> >     instances happen before we realize that replication is broken.

Retry with the CM path is part of copyAndVerify. doCopyRetry is shared between regular import
and repl load, so it does not deal with CM logic.
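
The split described above can be sketched roughly like this. It is an assumption-laden
illustration, not the code under review: file systems are in-memory maps, doCopy plays the
CM-agnostic role of doCopyRetry, copyWithCmFallback plays the role of copyAndVerify, and the
unchecked NoSuchElementException stands in for FileNotFoundException. Failing hard when the CM
copy is also missing follows anishek's suggestion; this thread does not confirm that the patch
behaves that way.

```java
import java.util.Map;
import java.util.NoSuchElementException;

public class CmFallbackSketch {
    // CM-agnostic copy: knows nothing about the CM root, so a plain import could share it.
    static void doCopy(Map<String, byte[]> sourceFs, Map<String, byte[]> destFs,
                       String from, String to) {
        byte[] data = sourceFs.get(from);
        if (data == null) {
            // Stand-in for FileNotFoundException (unchecked here to keep the sketch small).
            throw new NoSuchElementException("missing: " + from);
        }
        destFs.put(to, data.clone());
    }

    // Caller that owns the CM logic: try the original path, then the CM path.
    // Returns the source path that was actually copied.
    static String copyWithCmFallback(Map<String, byte[]> sourceFs, Map<String, byte[]> destFs,
                                     String originalPath, String cmPath, String to) {
        try {
            doCopy(sourceFs, destFs, originalPath, to);
            return originalPath;
        } catch (NoSuchElementException movedAway) {
            try {
                // The file was moved while we were copying; retry from the CM root.
                doCopy(sourceFs, destFs, cmPath, to);
                return cmPath;
            } catch (NoSuchElementException cmMissing) {
                // Missing in both places: fail instead of only logging.
                throw new IllegalStateException(
                    "file missing from original location and CM root: " + originalPath,
                    cmMissing);
            }
        }
    }
}
```

Keeping the fallback in the caller leaves the shared copy routine free of replication-specific
paths.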


- Daniel


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/62360/#review185534
-----------------------------------------------------------


On Sept. 15, 2017, 6:10 p.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/62360/
> -----------------------------------------------------------
> 
> (Updated Sept. 15, 2017, 6:10 p.m.)
> 
> 
> Review request for hive.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> See HIVE-16898
> 
> 
> Diffs
> -----
> 
>   metastore/src/java/org/apache/hadoop/hive/metastore/ReplChangeManager.java 88d6a7a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/ReplCopyTask.java 54746d3 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/repl/CopyUtils.java 28e7bcb 
> 
> 
> Diff: https://reviews.apache.org/r/62360/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested with a debugger: set a breakpoint right before the copy, then drop the table
> in another session.
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>

