hbase-user mailing list archives

From Paul Nickerson <paul.nicker...@escapemg.com>
Subject Re: Using hbase for time series
Date Thu, 09 Feb 2012 22:49:19 GMT
The row key would be the following:

(item id):(series interval: day/month/week/etc.):(series id #):(date code)
For example: 1000:day:3:20110504
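
A minimal sketch of that key layout in Python, for illustration only. The function names (`make_row_key`, `scan_bounds`) are hypothetical, and the ASCII encoding and YYYYMMDD date code are assumptions inferred from the example key above. It also shows how a date range for one series might map to scan start/stop rows, relying on the fact that an HBase scan's stop row is exclusive by default.

```python
from datetime import date, timedelta


def make_row_key(item_id: int, interval: str, series_id: int, day: date) -> bytes:
    """Build a key shaped like 1000:day:3:20110504 (assumed ASCII-encoded)."""
    return f"{item_id}:{interval}:{series_id}:{day:%Y%m%d}".encode("ascii")


def scan_bounds(item_id: int, interval: str, series_id: int,
                first_day: date, last_day: date) -> tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering first_day..last_day inclusive.

    The stop row is exclusive, so it is built from the day after last_day.
    """
    start = make_row_key(item_id, interval, series_id, first_day)
    stop = make_row_key(item_id, interval, series_id, last_day + timedelta(days=1))
    return start, stop


if __name__ == "__main__":
    print(make_row_key(1000, "day", 3, date(2011, 5, 4)))
    print(scan_bounds(1000, "day", 3, date(2011, 5, 1), date(2011, 5, 31)))
```

Because the date code is the last key component, all rows for one item, interval, and series are stored contiguously, so a read of a few hundred days stays within a narrow key range.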

I pre-split the table into 50 regions before using ImportTsv to generate the HFiles for the
initial load, which worked fine. The table is now about 60 GB. Scanning is fast while nothing is
writing to the table; generally I'll scan a few hundred rows and return those values to the user.
But while map/reduce is writing to that table, it takes about a minute to scan through 10 rows,
even when just doing scan 'table' in the shell.
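
The message does not say how the 50 split points were chosen. As one possible approach, and purely as an assumption, the sketch below picks evenly spaced split keys from a sorted sample of row keys. `pick_split_points` is a hypothetical helper, not part of HBase; the resulting keys would then be supplied as the split points when creating the table.

```python
def pick_split_points(sorted_keys: list[bytes], num_regions: int) -> list[bytes]:
    """Pick up to num_regions - 1 split keys, evenly spaced through a sorted sample.

    Assumes sorted_keys is sorted in byte order, which is how HBase orders rows.
    """
    if num_regions < 2 or not sorted_keys:
        return []
    step = len(sorted_keys) / num_regions
    points: list[bytes] = []
    for i in range(1, num_regions):
        key = sorted_keys[int(i * step)]
        # Skip duplicates so the split points stay strictly increasing.
        if not points or key > points[-1]:
            points.append(key)
    return points


if __name__ == "__main__":
    # Hypothetical sample: 100 zero-padded keys standing in for real row keys.
    sample = [f"{i:04d}".encode("ascii") for i in range(100)]
    splits = pick_split_points(sample, 50)
    print(len(splits), splits[:3])
```

N regions need N - 1 split keys, so a 50-region table takes 49 split points.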

Paul Nickerson 

Data Scientist 

Phone: 352-538-1962 
----- Original Message -----

From: "Tom" <fivemiletom@gmail.com> 
To: user@hbase.apache.org 
Cc: "Paul Nickerson" <paul.nickerson@escapemg.com> 
Sent: Thursday, February 9, 2012 5:28:45 PM 
Subject: Re: Using hbase for time series 

Hi Paul, 

Generally this should be possible; others are using HBase for time series (have a look at
the schema at opentsdb.net if you have not done so).

What do your row key schema and a typical read access look like (scan
over many rows / multiple regions ...)?


On 02/09/2012 02:12 PM, Paul Nickerson wrote: 
> I'm trying to create a time series table that contains a couple of billion rows. It contains
> daily values for several million items. This table will be visible to the outside world,
> so it should be able to support lots of reads at any point in time. My plan is to use
> map/reduce every night to batch load the day's values for each of the items into that table.
> The problem seems to be that read performance is dismal while I'm writing data to the table.
> Is there any way to accomplish what I'm trying to do? FWIW, I'm currently using the Hive
> HBase integration to load data into the HBase table.
> Thank you, 
> Paul Nickerson 
