hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Satish Mittal" <satish.mit...@apache.org>
Subject Re: Review Request 16951: HIVE-6109: Support customized location for EXTERNAL tables created by Dynamic Partitioning
Date Mon, 27 Jan 2014 09:04:04 GMT


> On Jan. 22, 2014, 10:25 p.m., Sushanth Sowmyan wrote:
> > Hi, Sorry for the delay, I thought I'd published this review over the weekend, but
reviewboard was unresponsive, and it looks like it didn't save.

Thanks Sushanth for the review.


> On Jan. 22, 2014, 10:25 p.m., Sushanth Sowmyan wrote:
> > hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java,
line 73
> > <https://reviews.apache.org/r/16951/diff/1/?file=424591#file424591line73>
> >
> >     "_DYN" is already defined in FosterStorageHandler, needs to have one place where
it's defined. I'm okay with it being defined here if the FosterStorageHandler constant is
removed and references to that are changed to reference this.

Will refactor it now. I didn't want to touch unrelated code in the first cut.


> On Jan. 22, 2014, 10:25 p.m., Sushanth Sowmyan wrote:
> > hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java,
line 272
> > <https://reviews.apache.org/r/16951/diff/1/?file=424591#file424591line272>
> >
> >     whitespace errors - git refers to a bunch of these through the patch when we
try to apply, please correct for final patch upload.

Will do.


> On Jan. 22, 2014, 10:25 p.m., Sushanth Sowmyan wrote:
> > hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java,
line 721
> > <https://reviews.apache.org/r/16951/diff/1/?file=424591#file424591line721>
> >
> >     A bit about code readability - if we add a special case, then it makes sense
to add the special case as an else, rather than as an if - that way, the default behaviour
is visible first, and then the special case - please swap this around so that this is a if
(!customDynamicLocationUsed) structure.

Fine, will do that.


> On Jan. 22, 2014, 10:25 p.m., Sushanth Sowmyan wrote:
> > hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java,
line 765
> > <https://reviews.apache.org/r/16951/diff/1/?file=424591#file424591line765>
> >
> >     This is now significant amount of code repetition from line 720-741 above, please
see if we can refactor this into a separate method.

I will see if it can be easily refactored into a private method.


> On Jan. 22, 2014, 10:25 p.m., Sushanth Sowmyan wrote:
> > hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputJobInfo.java,
line 178
> > <https://reviews.apache.org/r/16951/diff/1/?file=424594#file424594line178>
> >
> >     This becomes the primary API point with this change, wherein, a user that is
using HCatOutputFormat will generate an OutputJobInfo, and then call setCustomDynamicLocation
on it. This is fine for M/R users of HCat, but is something that will wind up having to be
implemented for each M/R user. It might have been better to define a constant in HCatConstants,
say "hcat.dynamic.partitioning.custom.pattern", and to use that as a JobInfo parameter. That
makes it easier for other tools to integrate with this feature. For example, with your patch,
we still do not support the ability for the HCatStorer from pig to be able to write to custom
dynamic partitions, while we do want to keep feature parity where possible between HCatOutputFormat
and HCatStorer.
> >     
> >     In fact, as a design goal for HCat, we're trying to move away from letting(requiring)
users explicitly muck around with OutputJobInfo and InputJobInfo, and stick to static calls
to HCatInputFormat/HCatOutputFormat.
> >     
> >     I would like to see this call be something the HCatOutputFormat automatically
calls if a jobConf parameter(as above) is set. That way, we can solve pig compatibility as
well easily.
> >

Thanks for this feedback Sushanth. I had realized that with the current patch, HCatStorer
is still not covered (only M/R jobs are covered), and was thinking of exposing equivalent
APIs in HCatStorer to achieve it. However your suggestion sounds cleaner to me. Will do that!


- Satish


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16951/#review32561
-----------------------------------------------------------


On Jan. 16, 2014, 12:09 p.m., Satish Mittal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/16951/
> -----------------------------------------------------------
> 
> (Updated Jan. 16, 2014, 12:09 p.m.)
> 
> 
> Review request for hive and Sushanth Sowmyan.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> - Attaching the patch that implements the functionality to support custom location for
external tables in dynamic partitioning.
> 
> 
> Diffs
> -----
> 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java
a5ae1be 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/FosterStorageHandler.java
288b7a3 
>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/HCatFileUtil.java PRE-CREATION

>   hcatalog/core/src/main/java/org/apache/hive/hcatalog/mapreduce/OutputJobInfo.java b63bdc2

>   hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/HCatMapReduceTest.java
77bdb9d 
>   hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatDynamicPartitioned.java
d8b69c2 
>   hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatExternalDynamicPartitioned.java
36c7945 
> 
> Diff: https://reviews.apache.org/r/16951/diff/
> 
> 
> Testing
> -------
> 
> - Added unit test.
> - Tested the functionality through a sample MR program that uses HCatOutputFormat interface
configured with the new custom dynamic location.
> 
> 
> Thanks,
> 
> Satish Mittal
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message