hbase-user mailing list archives

From Wade Arnold <wade.arn...@t8webware.com>
Subject Re: Use cases of HBase
Date Wed, 10 Mar 2010 05:02:33 GMT

HBase is part of the Hadoop project for a reason, even if we are HDFS's ugly
stepchild. Hive and HBase integration is changing how we solve user UI
analytics. We used to do massive exports, analytics via map/reduce or Pig,
and imports from and to HBase. Now that Hive and HBase tables can be used
together, we are looking to push most of our batch analytics "online" with
simple Hive queries.


On 3/9/10 7:49 PM, "Charles Woerner" <charleswoerner@gmail.com> wrote:

> As someone working in the clickstream analytics space right now, I strongly
> second this.
> On Tue, Mar 9, 2010 at 4:41 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>> Thanks for that one, Andrew. I think a great story is unifying both analytics
>> and real time on a single platform. This makes dev and ops so much easier.
>> In fact the Bigtable paper alludes to this strength. A single data platform
>> for most of your needs is powerful.
>> Of course some super speciality needs might require additional platforms,
>> e.g. MySQL for highly relational data, Memcache for high-read data, and so
>> on. But it is important from an architecture POV to keep the count of
>> distinct systems low.
>> On Mar 9, 2010 4:13 PM, "Andrew Purtell" <apurtell@apache.org> wrote:
>> I came to this discussion late.
>> Ryan and J-D's use case is clearly successful.
>> In addition to what others have said, I think another case where HBase
>> really excels is supporting analytics over Big Data (which I define as on
>> the order of petabytes). Some of the best performance numbers are put up by
>> scanners. There is tight integration with the Hadoop MapReduce framework,
>> not only in terms of API support but also with respect to efficient task
>> distribution over the cluster -- moving computation to data -- and there is
>> a favorable interaction with HDFS's location aware data placement. Moving
>> computation to data like that is one major reason why analytics using the
>> MapReduce paradigm can put conventional RDBMS/data warehouses to shame for
>> substantially less cost. Since 0.20.0, results of analytic computations over
>> the data can be materialized and served out in real time in response to
>> queries. This is a complete solution.
>>  - Andy
>> ----- Original Message ----
>>> From: Ryan Rawson <ryanobjc@gmail.com>
>>> To: hbase-user@hadoop.apac...
>>> Sent: Tue, March 9, 2010 3:34:55 PM
>>> Subject: Re: Use cases of HBase
>>> HBase operates more like a write-thru cache. Recent writes are in
>>> memory (aka memstore). Older...
>>> wrote:
>>>> Ryan, your confidence has me interested in exploring HBase a bit further for
>>>> some r...
>>>> On Tue, Mar 9, 2010 at 2:29 PM, Ryan Rawson wrote:
>>>>> One thing to note is that 10GB is ha...
>>>>> On Tue, Mar 9, 2010 at 2:08 PM, Jonathan Gray wrote:
>>>>>> Brian,
>>>>>> I would just r...
>>>>> wrote:
>>>>>>>> This is exactly the kind of feedback I'm looking for thanks,
>>>>>>>> wrote:
>>>>>>>>>> Hi all, I've got a question about how everyone is...
