hadoop-common-user mailing list archives

From "Mark Chadwick" <mchadw...@invitemedia.com>
Subject Re: Storing/retrieving time series with hadoop
Date Thu, 08 Jan 2009 01:56:59 GMT
Brock,

I've had good luck storing time-series data with HBase.  Its record-lookup
latency is orders of magnitude lower than that of Hadoop's MapReduce (which
is geared toward batch processing), yet it still stores its data on HDFS and
has mechanisms that let you run MapReduce jobs over your HBase data.
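
As a rough illustration of why low-latency lookups on time series can work
in HBase: HBase keeps rows sorted by the raw bytes of the row key, so a key
that ends in a big-endian timestamp turns "all points for one metric in a
time window" into a contiguous range scan. The sketch below only simulates
that ordering with a sorted Python list; it does not talk to HBase, and the
key layout, the `row_key` and `scan` helpers, and the metric names are all
assumptions made for illustration, not anything from this thread.

```python
import bisect
import struct


def row_key(metric: str, ts_seconds: int) -> bytes:
    """Hypothetical key layout: metric name, NUL separator, 8-byte big-endian time.

    Big-endian packing makes byte order agree with numeric order, so keys
    for one metric sort chronologically.
    """
    return metric.encode("utf-8") + b"\x00" + struct.pack(">Q", ts_seconds)


def scan(sorted_keys, metric, start_ts, stop_ts):
    """Return keys for `metric` with start_ts <= time < stop_ts.

    Stands in for a range scan over a store that is sorted by key.
    """
    lo = bisect.bisect_left(sorted_keys, row_key(metric, start_ts))
    hi = bisect.bisect_left(sorted_keys, row_key(metric, stop_ts))
    return sorted_keys[lo:hi]


# Three hourly samples for each of two made-up metrics.
timestamps = (1231286400, 1231290000, 1231293600)
keys = sorted(row_key(m, t) for m in ("clicks", "pageviews") for t in timestamps)

# Fetch the first two hours of "pageviews" without touching "clicks".
hits = scan(keys, "pageviews", 1231286400, 1231293600)
hit_times = [struct.unpack(">Q", k[-8:])[0] for k in hits]
```

The point of the design is that a chart query reads one contiguous slice of
the sorted keys instead of filtering the whole table.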

You may have a difficult time getting a data warehouse to fit the model of
HBase, but if you are specifically looking at Hadoop, it will be one of
your better bets.

-Mark Chadwick

On Wed, Jan 7, 2009 at 8:03 PM, Brock Judkins <brockjudkins@gmail.com> wrote:

> Hi list,
> I am researching hadoop as a possible solution for my company's data
> warehousing solution. My question is whether hadoop, possibly in
> combination with Hive or Pig, is a good solution for time-series data? We
> basically have a ton of web analytics to store that we display both
> internally and externally.
>
> For the time being I am storing timestamped data points in a huge MySQL
> table, but I know this will not scale very far (although it's holding up ok
> at almost 90MM rows). I am aware that hadoop can scale insanely large
> (larger than I need), but does anyone have experience using it to draw
> charts based on time series with fairly low latency?
>
> Thanks!
> Brock
>
