hive-dev mailing list archives

From j.prasant...@gmail.com
Subject Re: Review Request 24830: HIVE-7548: Precondition checks should not fail the merge task in case of automatic trigger
Date Mon, 25 Aug 2014 18:36:21 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24830/#review51419
-----------------------------------------------------------



ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
<https://reviews.apache.org/r/24830/#comment89734>

    Other utility functions also extract the taskID/attemptID from file names, and none
of them throws an exception when the regex pattern finds no match. For example,
getIdFromFilename() returns the filename as the Id if it cannot match the pattern. I followed
the same convention: if nothing matches the copy file pattern, the check returns
false and falls back to the old code path.
    
    The regex will still work if files are loaded using a "LOAD DATA LOCAL INPATH" statement.
With this statement, the file names will look like
    1) filename.txt
    2) filename_copy_1.txt
    3) filename_copy_2.txt
    
    For this file pattern, taskId/attemptId extraction finds no match, so no
files are marked as duplicates. We don't have to worry about copy file names in this
case, because no duplicate elimination takes place.
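
    The convention described above can be sketched roughly as follows. This is only an
illustration, not the code in Utilities.java: the class name, both method names, and the
regex are hypothetical, and the pattern assumes task output files are named
<taskId>_<attemptId> with an optional _copy_N suffix and an optional extension.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Hive.
final class CopyFileNameSketch {
    // Assumed naming scheme: <taskId>_<attemptId>[_copy_<n>][.ext]
    // This is NOT the regex used in Utilities.java.
    private static final Pattern TASK_FILE =
        Pattern.compile("^(\\d+)_(\\d+)(_copy_(\\d+))?(\\..*)?$");

    // Returns the task ID, or the whole file name when the pattern does not match.
    static String taskIdOrFilename(String filename) {
        Matcher m = TASK_FILE.matcher(filename);
        return m.matches() ? m.group(1) : filename;
    }

    // Returns true only for task output files with a _copy_N suffix.
    // A non-matching name yields false instead of an exception, so the
    // caller can fall back to its old code path.
    static boolean isCopyFile(String filename) {
        Matcher m = TASK_FILE.matcher(filename);
        return m.matches() && m.group(3) != null;
    }

    public static void main(String[] args) {
        String[] names = {"000000_0", "000000_0_copy_1", "filename_copy_1.txt"};
        for (String name : names) {
            System.out.println(name + " -> id=" + taskIdOrFilename(name)
                + ", copy=" + isCopyFile(name));
        }
    }
}
```

    Under these assumptions, a name such as filename_copy_1.txt does not match at all, so
it is never treated as a task-generated copy and takes no part in duplicate elimination.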



ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java
<https://reviews.apache.org/r/24830/#comment89735>

    Fixed it.


- Prasanth_J


On Aug. 19, 2014, 12:29 a.m., Prasanth_J wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24830/
> -----------------------------------------------------------
> 
> (Updated Aug. 19, 2014, 12:29 a.m.)
> 
> 
> Review request for hive and Gunther Hagleitner.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> ORC fast merge (HIVE-7509) fails the merge task if any of the precondition
> checks fail. Failing on a precondition check is appropriate for "ALTER TABLE .. CONCATENATE" but not for
> a merge task triggered automatically by the conditional resolver. If a partition has ORC files
> that are not compatible for merging, the merge task should ignore them and not fail.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java beb4f7d 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java b36152a 
>   ql/src/test/queries/clientnegative/orc_merge1.q b2d42cd 
>   ql/src/test/queries/clientnegative/orc_merge2.q 2f62ee7 
>   ql/src/test/queries/clientnegative/orc_merge3.q 5158e2e 
>   ql/src/test/queries/clientnegative/orc_merge4.q ad48572 
>   ql/src/test/queries/clientnegative/orc_merge5.q e94a8cc 
>   ql/src/test/queries/clientpositive/orc_merge_incompat1.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/orc_merge_incompat2.q PRE-CREATION 
>   ql/src/test/results/clientpositive/orc_merge_incompat1.q.out PRE-CREATION 
>   ql/src/test/results/clientpositive/orc_merge_incompat2.q.out PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/24830/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Prasanth_J
> 
>

