hbase-user mailing list archives

From hmch...@tsmc.com
Subject Re: Bulk load + Secondary index
Date Wed, 15 Jun 2011 00:16:17 GMT


Hi,

I am trying to speed up the initial data load into HBase.
The pipeline is: spool the Oracle data to a CSV file -> importtsv (MapReduce
job that writes HFiles) -> completebulkload -> build the secondary index.

Spool Oracle data (one table, 7 million records): 4 hours
importtsv on a 5-machine cluster: 20 minutes
completebulkload: 5 seconds
Build secondary index: N/A
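
As a rough sketch of where the time goes, the figures above can be turned into per-stage shares and throughput. This uses only the numbers listed in this message; the class and method names are illustrative, and the index build is left out because no timing is available for it.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical names): breaks down the measured load times.
public class LoadTimings {
    // Row count reported for the single Oracle table.
    static final long ROWS = 7_000_000L;

    // Measured duration of each stage, in seconds, in pipeline order.
    static Map<String, Long> stageSeconds() {
        Map<String, Long> stages = new LinkedHashMap<>();
        stages.put("spool", 4L * 3600);        // 4 hours
        stages.put("importtsv", 20L * 60);     // 20 minutes
        stages.put("completebulkload", 5L);    // 5 seconds
        return stages;
    }

    // Sum of all measured stages.
    static long totalSeconds() {
        long total = 0;
        for (long seconds : stageSeconds().values()) {
            total += seconds;
        }
        return total;
    }

    // Rows handled per second by the given stage.
    static double rowsPerSecond(String stage) {
        return (double) ROWS / stageSeconds().get(stage);
    }

    // Fraction of the total measured time spent in the given stage.
    static double shareOfTotal(String stage) {
        return (double) stageSeconds().get(stage) / totalSeconds();
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : stageSeconds().entrySet()) {
            System.out.printf("%-17s %6d s %5.1f%%%n",
                    e.getKey(), e.getValue(), 100.0 * shareOfTotal(e.getKey()));
        }
    }
}
```

On these numbers the spool step accounts for about 92% of the measured time, so it is the stage worth optimizing first.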

Our production database (Oracle) can only be down for a few hours, so it
seems hard to spool its data within that window.
Many tables also have no update-time column, so we cannot load their data
incrementally based on such a column.
Any suggestions?

BTW, I found the root cause of my empty indexed table.
After passing conf to HTable's constructor, it works fine. Thank God!

hbase-transactional-tableindexed / IndexedTableAdmin.java

  private void reIndexTable(final byte[] baseTableName,
      final IndexSpecification indexSpec) throws IOException {
    HTable baseTable = new HTable(this.conf, baseTableName);
    HTable indexTable = new HTable(this.conf,
        indexSpec.getIndexedTableName(baseTableName));

Fleming Chiu(邱宏明)
Ext: 707-2260
Be Veg, Go Green, Save the Planet!


                                                                                         
                                                
From: saint.ack@gmail.com
To: user@hbase.apache.org
Date: 2011/06/15 02:55 AM
Subject: Re: Bulk load + Secondary index
Reply-To: user@hbase.apache.org

2011/6/14  <hmchiud@tsmc.com>:
> I am trying to dump my oracle data to hbase by bulk load.
> After that, build my index by hbase-transactional-tableindexed.

Perhaps bulk loading is bypassing tableindexed's means of populating
the secondary index?


> Therefore, I wonder they are not the good combination.

What are you trying to achieve?


> Be Veg, Go Green, Save the Planet!

I will.


St.Ack





