hbase-user mailing list archives

From Marcos Ortiz <mlor...@uci.cu>
Subject Re: Timeseries data
Date Wed, 29 Aug 2012 00:33:38 GMT
Take a look at OpenTSDB, built at StumbleUpon and described by Benoit "tsuna" Sigoure
(tsuna@stumbleupon.com) in the
HBaseCon talk "Lessons Learned from OpenTSDB".
His team has done a great job working with time-series data, and he
gave a lot of good advice on handling this kind of data in HBase:
- Wider rows to seek faster
- Use asynchbase + Netty or Finagle (an RPC library created by Twitter
engineers) = performance ++
- Make writes idempotent and independent
    before: start rows at arbitrary points in time
    after: align rows on 10m (then 1h) boundaries
- Store more data per Key/Value
- Compact your data
- Use short family names
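
To make the "align rows on boundaries" and "idempotent writes" points concrete, here is a minimal sketch of the idea in Python. It does not use any HBase client API and is not OpenTSDB's actual code; the function names, the 3-byte metric id, and the 4-byte/2-byte field widths are assumptions chosen only for illustration:

```python
import struct

ROW_SPAN_SECONDS = 3600  # one row per metric per hour (the "1h" boundary)


def row_key(metric_id: int, ts_seconds: int) -> bytes:
    """Hypothetical layout: 3-byte metric id + 4-byte aligned base time."""
    # Round the timestamp down to the start of its hour so that every
    # point in the same hour lands in the same (wide) row.
    base = ts_seconds - (ts_seconds % ROW_SPAN_SECONDS)
    return metric_id.to_bytes(3, "big") + struct.pack(">I", base)


def qualifier(ts_seconds: int) -> bytes:
    """Offset of the point inside its row, as a 2-byte big-endian integer."""
    # The offset is at most 3599, so it always fits in two bytes.
    return struct.pack(">H", ts_seconds % ROW_SPAN_SECONDS)


def cell(metric_id: int, ts_seconds: int, value: float):
    """Return the (row, qualifier, value) bytes for one data point."""
    return (
        row_key(metric_id, ts_seconds),
        qualifier(ts_seconds),
        struct.pack(">d", value),
    )
```

Because the row key and qualifier are derived only from the metric and the timestamp, a retried write for the same point targets the same cell instead of creating a new row, which is what makes the write idempotent in this sketch.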
Best wishes
On 28/08/2012 20:21, Mohit Anchlia wrote:
> With time-series data, how do people deal with scenarios where one might
> get multiple events in a millisecond? Using a nanosecond approach seems
> tricky. Another option is to take advantage of versions or counters.
>
>
> 10th ANNIVERSARY OF THE FOUNDING OF THE UNIVERSIDAD DE LAS CIENCIAS INFORMATICAS...
> CONNECTED TO THE FUTURE, CONNECTED TO THE REVOLUTION
>
> http://www.uci.cu
> http://www.facebook.com/universidad.uci
> http://www.flickr.com/photos/universidad_uci

