accumulo-dev mailing list archives

From "Bill Havanki" <bhava...@clouderagovt.com>
Subject Re: Review Request 15752: ACCUMULO-1854 Persist AccumuloInputFormat information from Configuration into RangeInputSplit
Date Thu, 21 Nov 2013 21:31:55 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/15752/#review29247
-----------------------------------------------------------



src/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java
<https://reviews.apache.org/r/15752/#comment56386>

    This method no longer needs the conf parameter. The method above it, which takes a
TaskAttemptContext parameter, therefore no longer needs that parameter either.



src/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java
<https://reviews.apache.org/r/15752/#comment56387>

    This method no longer needs the conf parameter. The method above it, which takes a
TaskAttemptContext parameter, therefore no longer needs that parameter either.


- Bill Havanki


On Nov. 21, 2013, 12:31 a.m., Josh Elser wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/15752/
> -----------------------------------------------------------
> 
> (Updated Nov. 21, 2013, 12:31 a.m.)
> 
> 
> Review request for accumulo.
> 
> 
> Bugs: ACCUMULO-1854
>     https://issues.apache.org/jira/browse/ACCUMULO-1854
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> The current way that AccumuloInputFormat works requires that the same *exact* Configuration that was used to invoke getSplits() is also provided when createRecordReader() is called on the InputFormat. In practice, notably looking at InputFormat implementations which merge or delegate other InputFormats, this is a bad idea.
> 
> By serializing the necessary information into the RangeInputSplit from the provided Configuration object in getSplits(), we can completely avoid this problem, at the minimal expense of serializing this information into each InputSplit. I tried to implement the changes in such a way that they would be backwards compatible. If the information is not provided (is null) in the RangeInputSplit, the RecordReader will still attempt to pull a value from the Configuration object so as to not fail immediately. This should provide a little more flexibility if users have custom code built on top of the AccumuloInputFormat and RangeInputSplit.
> 
> 
> Diffs
> -----
> 
>   src/core/src/main/java/org/apache/accumulo/core/client/mapreduce/AccumuloInputFormat.java 4de131f
>   src/core/src/main/java/org/apache/accumulo/core/client/mapreduce/InputFormatBase.java 8e238f1
>   src/core/src/main/java/org/apache/accumulo/core/client/mapreduce/RangeInputSplit.java PRE-CREATION
>   src/core/src/test/java/org/apache/accumulo/core/client/mapreduce/AccumuloInputFormatTest.java ba647e9
>   src/core/src/test/java/org/apache/accumulo/core/client/mapreduce/AccumuloRowInputFormatTest.java 0673f1b
>   src/core/src/test/java/org/apache/accumulo/core/client/mapreduce/RangeInputSplitTest.java PRE-CREATION
>   src/examples/simple/src/test/java/org/apache/accumulo/examples/simple/filedata/ChunkInputFormatTest.java c31c738
> 
> Diff: https://reviews.apache.org/r/15752/diff/
> 
> 
> Testing
> -------
> 
> Verified that the changes work as intended using PigInputFormat (which may delegate to many InputFormats). Added unit tests and verified sufficient coverage using Cobertura.
> 
> 
> Thanks,
> 
> Josh Elser
> 
>
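
The approach in the quoted description (carry the settings inside the split, and fall back to the Configuration when a field is null) might look roughly like the sketch below. This is not the code under review. SketchSplit, SketchReader, the two field names, and the use of a plain Map in place of Hadoop's Configuration are all assumptions made so that the example compiles with only the JDK. The write/readFields method names mirror Hadoop's Writable interface, but the class does not implement it.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Map;

/** Hypothetical stand-in for a split that carries its own settings. */
class SketchSplit {
    String tableName;     // null means "not set on the split"
    String instanceName;  // null means "not set on the split"

    /** Serialize each field with a presence flag so that null survives a round trip. */
    void write(DataOutput out) throws IOException {
        writeNullable(out, tableName);
        writeNullable(out, instanceName);
    }

    /** Read the fields back in the same order they were written. */
    void readFields(DataInput in) throws IOException {
        tableName = readNullable(in);
        instanceName = readNullable(in);
    }

    private static void writeNullable(DataOutput out, String value) throws IOException {
        out.writeBoolean(value != null);
        if (value != null) {
            out.writeUTF(value);
        }
    }

    private static String readNullable(DataInput in) throws IOException {
        return in.readBoolean() ? in.readUTF() : null;
    }
}

/** Hypothetical stand-in for the record reader's lookup logic. */
class SketchReader {
    /**
     * Prefer the value stored on the split; if it is null, fall back to the
     * configuration (here a Map, standing in for Hadoop's Configuration).
     * Returns null when neither source has a value.
     */
    static String resolve(String fromSplit, Map<String, String> conf, String key) {
        return fromSplit != null ? fromSplit : conf.get(key);
    }
}
```

With this shape, a split produced by older code that never populated the fields would still resolve its settings from the configuration, which is the backwards-compatibility behavior the description mentions.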

