hive-user mailing list archives

From John Sichi <jsi...@facebook.com>
Subject Re: Incremental load from Hive into HBase?
Date Fri, 01 Oct 2010 01:45:07 GMT
Good point.  In retrospect, I guess I should have modified the grammar to support a regular
INSERT (without OVERWRITE) and require usage of that for HBase (but prohibit it for native
tables).  Probably too late for that now, so I guess we'll just say that the OVERWRITE in
the HBase case means that if keys match existing rows, those rows are overwritten (but existing
rows are not deleted as they would be with a native Hive table).
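
As a rough sketch of those semantics (the table names hbase_events and
staging_events, the ds column, and the cf:val mapping are all made up for
illustration and are not from the wiki):

```sql
-- Hypothetical HBase-backed table; names and column mapping are assumptions.
CREATE TABLE hbase_events(key string, value string)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:val");

-- First load: every selected row is written to HBase under its key.
INSERT OVERWRITE TABLE hbase_events
SELECT key, value FROM staging_events WHERE ds = '2010-09-27';

-- Second load: rows whose keys already exist are overwritten with the new
-- values, rows with new keys are added, and rows from the first load whose
-- keys do not appear here are left in place (a native Hive table would
-- instead end up holding only the result of this second SELECT).
INSERT OVERWRITE TABLE hbase_events
SELECT key, value FROM staging_events WHERE ds = '2010-09-28';
```

So with the storage handler, repeated INSERT OVERWRITE statements should
behave like an incremental upsert keyed on the row key.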

If that's OK, I'll update the wiki accordingly.

JVS

On Sep 28, 2010, at 10:50 PM, Leo Alekseyev wrote:

> On Tue, Sep 28, 2010 at 7:50 PM, Leo Alekseyev <dnquark@gmail.com> wrote:
>> I can create and load data into an HBase table as per the instructions
>> from Hive/HBase Integration wiki page using something like
>> create table ...
>> STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
>> WITH SERDEPROPERTIES ("hbase.columns.mapping" = ....);
>> 
>> Is it possible to then load more data from Hive into this table?  I
>> keep seeing references to "bulk inserts" vs "incremental inserts" in
>> people's slides, as well as references to HBASE-1923, but no concrete
>> examples.
> 
> I will start by answering my own question: with HBaseStorageHandler,
> INSERT OVERWRITE TABLE foo ... statement appears to append rows (given
> that the row keys are unique).  Note that this is different than
> "regular" Hive tables, which would get overwritten under similar
> circumstances.  Perhaps this should be spelled out in the wiki...
> 
> This resolves the original question, but further comments on the issue
> are always welcome :)

