hbase-user mailing list archives

From Friso van Vollenhoven <fvanvollenho...@xebia.com>
Subject Re: Using HBase in combination with HDFS directly
Date Wed, 05 Jan 2011 16:00:57 GMT
I guess so.

HBase actually has quite a strong consistency model. The thing is that it only applies at the row level.
Multi-row transactions would require multiple locks and some kind of commit/rollback solution.
Have you had a look at Google's Percolator paper?
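
To make the "multiple locks plus commit/rollback" point concrete, here is a minimal sketch in plain Python. It is a toy in-memory store, not the HBase client API and not how HBase or Percolator do it; the names ToyRowStore, multi_row_update and validate are made up for illustration. It only shows why multi-row atomicity needs more machinery than a single-row put.

```python
import threading


class ToyRowStore:
    """Hypothetical in-memory stand-in for a table with atomic single-row writes."""

    def __init__(self):
        self._rows = {}                  # row key -> dict of column values
        self._locks = {}                 # row key -> lock for that row
        self._guard = threading.Lock()   # protects the lock table itself

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get(self, key):
        with self._lock_for(key):
            return dict(self._rows.get(key, {}))

    def put(self, key, values):
        # Single-row write: atomic because it holds only that row's lock.
        with self._lock_for(key):
            self._rows.setdefault(key, {}).update(values)

    def multi_row_update(self, updates, validate=None):
        """Apply updates to several rows as one unit, or to none of them."""
        keys = sorted(updates)           # fixed lock order avoids deadlock
        locks = [self._lock_for(k) for k in keys]
        for lock in locks:
            lock.acquire()
        try:
            # Snapshot the affected rows so a failure can be rolled back.
            before = {k: dict(self._rows[k]) if k in self._rows else None
                      for k in keys}
            try:
                for k in keys:
                    self._rows.setdefault(k, {}).update(updates[k])
                if validate is not None and not validate(self._rows):
                    raise ValueError("validation failed")
            except Exception:
                # Roll back: restore every touched row to its snapshot.
                for k, old in before.items():
                    if old is None:
                        self._rows.pop(k, None)
                    else:
                        self._rows[k] = old
                raise
        finally:
            for lock in reversed(locks):
                lock.release()


store = ToyRowStore()
store.put("alice", {"balance": 100})
store.put("bob", {"balance": 0})

# Commit case: both rows change together.
store.multi_row_update({"alice": {"balance": 70}, "bob": {"balance": 30}})

# Rollback case: the validation fails, so neither row keeps its new value.
try:
    store.multi_row_update(
        {"alice": {"balance": -10}, "bob": {"balance": 110}},
        validate=lambda rows: all(r["balance"] >= 0 for r in rows.values()),
    )
except ValueError:
    pass
```

In a distributed store the locks and the rollback state would have to survive node failures, which is the hard part and the reason the paper is worth reading.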


Friso



On 5 jan 2011, at 16:49, Peter Veentjer wrote:

> I also want to see if an STM like Multiverse can be aligned with NoSQL
> solutions like HBase. But to do that, I first need to get more hands on
> experience with NoSQL solutions.
> 
> On Wed, Jan 5, 2011 at 4:34 PM, Peter Veentjer <alarmnummer@gmail.com> wrote:
> 
>> 
>> 
>> On Wed, Jan 5, 2011 at 4:03 PM, Friso van Vollenhoven <
>> fvanvollenhoven@xebia.com> wrote:
>> 
>>> Hi Peter,
>>> 
>>> Do you mean you want to use the HDFS that HBase relies on for other things
>>> and not just exclusively HBase? That should be just fine. We do it all the
>>> time.
>>> 
>>> 
>> Ok thanks.
>> 
>> 
>> 
>>> Are you worried about putting too much load on it?
>> 
>> 
>> For the POC it won't matter that much. I can get my stuff up and running.
>> 
>> 
>>> I guess that depends on the type of workload that you have and what you
>>> do with it. But generally I think it is nice to have all nodes be the same
>>> (so every worker is both a datanode and a region server), so that you
>>> don't have to scale them out separately.
>>> 
>> 
>>> Peter, are you based in The Netherlands by any chance? There is a NoSQL
>>> meetup group in NL (http://www.meetup.com/nosql-nl/) with meetups every
>>> now and then. The next one is on January 24 and is all about HBase. We're
>>> doing an on-the-spot install on a number of the laptops present to create
>>> a temporary cluster and play around with it. I have been working with
>>> Hadoop and HBase for the past couple of months, so if you care to come by,
>>> I'd be happy to share some experiences.
>> 
>> Yes, I live in Holland. I'm a former Xebia employee :) I think I'll visit
>> one of the nosql meetups.
>> 
>> We are building a kind of application server where, instead of providing
>> services like JMS, Servlets, EJBs etc., we are providing services for secured
>> document storage, message exchange, semantic analysis of documents etc. It
>> is all based on GigaSpaces, but I have the impression (after working more
>> than a year with it) that it is very time consuming to get right. Apart from
>> all the correctness issues (and there were/are many, based on bad usage of
>> GigaSpaces and architectural choices) there are also some
>> performance/scalability issues that need solving.
>> 
>> So I decided to rewrite the main use cases using HBase. I had most of the
>> functionality up and running in a few days and most of the 'bad
>> architectural choices' we are going to remove in the next 6 months are not
>> there from the beginning (e.g. using streams instead of byte arrays for
>> document processing.. how stupid can you be). It also was a nice exercise to
>> play with HBase and less consistent solutions.
>> 
>> I normally work on realizing very high consistency for Multiverse:
>> 
>> http://multiverse.codehaus.org
>> 
>> So I want to have some hands on experience with using less consistent
>> solutions.
>> 
>> 
>>> 
>>> Friso
>>> 
>>> 
>>> 
>>> On 5 jan 2011, at 14:41, Peter Veentjer wrote:
>>> 
>>>> Hi Guys,
>>>> 
>>>> I'm currently writing a POC based on HBase and I spend more time on
>>>> writing a UI than on writing the HBase functionality. So I'm very excited
>>>> about exploring HBase further and doing some serious performance and
>>>> scalability tests to see if we can use it as core technology instead of
>>>> the time/resource intensive GigaSpaces.
>>>> 
>>>> My question:
>>>> 
>>>> I'm currently using HBase and I also want to use the HDFS directly to
>>>> store files. If the HBase server(s) is installed, can I directly access
>>>> the HDFS of these servers, or is it better to set up a separate Hadoop
>>>> server for running HDFS?
>>> 
>>> 
>> 

