accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Hive/Accumulo
Date Fri, 24 Apr 2015 20:47:41 GMT
When you define a table that's backed by the AccumuloStorageHandler, you 
define a Hive column which is essentially the Accumulo rowID (":rowID" 
in the column mapping string).

You can then filter on that Hive column in the WHERE clause. The storage 
handler extracts that part of the predicate and uses it to set a normal 
Accumulo Range on the scan.

Concretely:

CREATE TABLE my_table(uid string, name string, age int, height int)
STORED BY 'org.apache.hadoop.hive.accumulo.AccumuloStorageHandler'
WITH SERDEPROPERTIES ("accumulo.columns.mapping" =
   ":rowID,person:name,person:age,person:height");

SELECT * FROM my_table WHERE uid > "f" AND uid < "m";

In the above example, the "uid" Hive column maps to ":rowID". The 
WHERE clause in this query limits the Range used on the Scanner to 
("f", "m"), exclusive at both ends because both comparisons are strict.
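
If you want inclusive endpoints, which I believe is what the shell's 
'scan -b f -e m' gives you by default, a sketch of the equivalent queries 
is below. It assumes the storage handler also pushes >=, <= and = on the 
rowID column down into the Range, which is worth verifying on your Hive 
version. The rowID "bob" in the second query is made up for illustration.

```sql
-- Inclusive at both ends: intended to map to the Range ["f", "m"]
-- (assumes >= and <= on the rowID column are pushed down like > and <)
SELECT * FROM my_table WHERE uid >= "f" AND uid <= "m";

-- Single-row lookup: equality on the rowID column
-- ("bob" is a hypothetical rowID)
SELECT name, age FROM my_table WHERE uid = "bob";
```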

Does that help?

THORMAN, ROBERT D wrote:
> Does anyone know if there is a way to limit the rowID range that Hive
> will scan on an Accumulo table? What I’m looking for is the equivalent
> of ‘scan -b <start-row> -e <end-row>’ in an HQL statement.
>
> v/r
> Bob Thorman
> Principal Big Data Engineer
> AT&T Big Data CoE
> 2900 W. Plano Parkway
> Plano, TX 75075
> 972-658-1714
>
>
