hive-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Mike Fagan <mfa...@tde.com>
Subject Re: Review Request 56290: HIVE-15795: Add Accumulo Index Table Support
Date Thu, 30 Mar 2017 05:24:41 GMT


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 43 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line43>
> >
> >     A few questions:
> >     
> >     * What does "column_column" mean? Should it just be "column_name"?
> >     * "table rowid" -- does this mean the qualifier includes the base table name
in addition to the rowId?
> >     * Is the visibility component just a push-down of what was included on the data-table
mutation?

I did the default implementation to match how presto builds indexes and it "hivecolumnname_hivecolumnanme"
I can change this to another scheme if you prefer


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 48 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line48>
> >
> >     What happens if I provide an `accumulo.indextable.name` but no `accumulo.indexed.columns`?

It assumes all columns are indexed same as accumulo.indexed.columns = '*'


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 52 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line52>
> >
> >     Defaults to this class, yes? Would suggest including that in the docs.

Thanks. I will include this class as the default in the docs.


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 105 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line105>
> >
> >     What's the reasoning behind setting isolation on the scanner?

I used the isolation on scanner to prevent in-flight writes. If this is incorrect use of the
construct, I will remove it.
Based on futher testing I have removed this setting.


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 106 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line106>
> >
> >     Oh, this is what your javadoc was saying? Why is this duplicated?

This is fixed to use 'cf_cq'


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 122 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line122>
> >
> >     I'm worried about the use of `NO_MATCH` like this. There's always the possibility
of this special `Range` causing troubles due to the user-data.
> >     
> >     Is there a reason this can't just be an empty list?

This was my attempt to short-circut the scan. Its my understanding the empty list will invoke
a full scan.

I appreciate your thoughts on a better solution.


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
> > Lines 131 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624877#file1624877line131>
> >
> >     Please include a real error message (e.g. "Failed to scan index table").

understood will do


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexParameters.java
> > Lines 39 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624878#file1624878line39>
> >
> >     Looks like this is un-used?

yes unused - I will remove


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexParameters.java
> > Lines 62 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624878#file1624878line62>
> >
> >     Probably want to strip the leading/trailing whitespace on each element.

Thanks that is a great point. I will add this logic to trim each element


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexParameters.java
> > Lines 89 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624878#file1624878line89>
> >
> >     I think this should propagate an exception. The user specified something we
could not load -- this should fail-hard.

I will create a new exception class and throw it here


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexScanner.java
> > Lines 29 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624879#file1624879line29>
> >
> >     Please add javadoc for each of these methods. Since they're intended for extension,
the developer needs to know what each of these methods should do.

Will do


> On Feb. 6, 2017, 6:08 p.m., Josh Elser wrote:
> > accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloIndexParameters.java
> > Lines 32 (patched)
> > <https://reviews.apache.org/r/56290/diff/1/?file=1624883#file1624883line32>
> >
> >     Would recommend just creating an inner-class that `implements AccumuloIndexScanner`
instead of re-using the Test class itself.

Will do


On Feb. 6, 2017, 6:08 p.m., Mike Fagan wrote:
> > This is a good start, Mike. Lots of little feedback from me :)

Thanks let me generate another patch to address all these identifed issues. I will post the
patch here first before posting it to jira. 
I would appreciate you throught on changing the index field naming scheme.


- Mike


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56290/#review164360
-----------------------------------------------------------


On March 6, 2017, 4:24 p.m., Mike Fagan wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56290/
> -----------------------------------------------------------
> 
> (Updated March 6, 2017, 4:24 p.m.)
> 
> 
> Review request for hive and Josh Elser.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> HIVE-15795: Add Accumulo Index Table Support
> 
> 
> Diffs
> -----
> 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloDefaultIndexScanner.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexLexicoder.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexScanner.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloIndexScannerException.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/AccumuloStorageHandler.java
cdbc7f2 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexDefinition.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/AccumuloIndexedOutputFormat.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/HiveAccumuloTableOutputFormat.java
3ae5431 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/IndexOutputConfigurator.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/mr/package-info.java PRE-CREATION

>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/AccumuloPredicateHandler.java
a7ec7c5 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/AccumuloRangeGenerator.java
21392d1 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/predicate/PrimitiveComparisonFilter.java
17d5529 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/AccumuloIndexParameters.java
PRE-CREATION 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/AccumuloSerDeParameters.java
09c5f24 
>   accumulo-handler/src/java/org/apache/hadoop/hive/accumulo/serde/package-info.java PRE-CREATION

>   accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloDefaultIndexScanner.java
PRE-CREATION 
>   accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloIndexLexicoder.java
PRE-CREATION 
>   accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloIndexParameters.java
PRE-CREATION 
>   accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/TestAccumuloStorageHandler.java
0aaa782 
>   accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloPredicateHandler.java
88e4530 
>   accumulo-handler/src/test/org/apache/hadoop/hive/accumulo/predicate/TestAccumuloRangeGenerator.java
339da07 
>   accumulo-handler/src/test/queries/positive/accumulo_queries.q 279b661 
> 
> 
> Diff: https://reviews.apache.org/r/56290/diff/2/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Mike Fagan
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message