hbase-user mailing list archives

From Peter Veentjer <alarmnum...@gmail.com>
Subject Re: Using HBase in combination with HDFS directly
Date Wed, 05 Jan 2011 16:06:32 GMT
On Wed, Jan 5, 2011 at 5:00 PM, Friso van Vollenhoven <
fvanvollenhoven@xebia.com> wrote:

> I guess so.
>
> HBase actually has quite a strong consistency model.


It depends on how consistency is defined. HBase does not support repeatable
reads because there is no concept of a transaction, so two reads of the same
data can return different results if a write lands in between. In STM terms
this would be called a very low level of consistency. There are higher levels,
like 'snapshot' consistency, where your reads are not only repeatable but also
causally consistent. And then of course there is the serializable isolation
level, where even write skews are prevented.
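
To make the difference between a plain read and a snapshot read concrete,
here is a minimal sketch. It assumes a toy in-memory multi-version store, not
HBase itself; the class SnapshotReadSketch and all of its methods are made up
for illustration and are not part of any HBase or Multiverse API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy multi-version store (hypothetical, for illustration only). */
public class SnapshotReadSketch {
    // key -> (commit version -> value); TreeMap keeps versions ordered.
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();
    private long clock = 0;

    /** Stores a new version of the key and returns its commit version. */
    public synchronized long put(String key, String value) {
        long version = ++clock;
        cells.computeIfAbsent(key, k -> new TreeMap<>()).put(version, value);
        return version;
    }

    /** Plain read: returns whatever is newest at the moment of the call. */
    public synchronized String readLatest(String key) {
        TreeMap<Long, String> versions = cells.get(key);
        return versions == null ? null : versions.lastEntry().getValue();
    }

    /** Returns the current version, to be pinned for later snapshot reads. */
    public synchronized long snapshot() {
        return clock;
    }

    /** Snapshot read: newest value whose version is <= the pinned version. */
    public synchronized String readAt(String key, long snapshotVersion) {
        TreeMap<Long, String> versions = cells.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, String> entry = versions.floorEntry(snapshotVersion);
        return entry == null ? null : entry.getValue();
    }

    public static void main(String[] args) {
        SnapshotReadSketch store = new SnapshotReadSketch();
        store.put("row1", "a");
        long snap = store.snapshot();   // reader pins its view here
        store.put("row1", "b");         // another writer updates the row

        System.out.println(store.readLatest("row1"));   // sees the new write
        System.out.println(store.readAt("row1", snap)); // still sees the old value
    }
}
```

The plain read picks up the concurrent write, so repeating it is not
guaranteed to give the same answer. The snapshot read keeps returning the
value that was current when the version was pinned. Preventing write skew
would need more than this, for example validating the read set at commit
time.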


> Thing is, that it is just row level. Multi row transactions would require
> multiple locks and some kind of commit / roll back solution. Have you had a
> look at Google's percolator paper?
>

Not yet. I'll check it out.


>
>
> Friso
>
>
>
> On 5 jan 2011, at 16:49, Peter Veentjer wrote:
>
> > I also want to see if an STM like Multiverse can be aligned with NoSQL
> > solutions like HBase. But to do that, I first need to get more hands on
> > experience with NoSQL solutions.
> >
> > On Wed, Jan 5, 2011 at 4:34 PM, Peter Veentjer <alarmnummer@gmail.com>
> > wrote:
> >
> >>
> >>
> >> On Wed, Jan 5, 2011 at 4:03 PM, Friso van Vollenhoven <
> >> fvanvollenhoven@xebia.com> wrote:
> >>
> >>> Hi Peter,
> >>>
> >>> Do you mean you want to use the HDFS that HBase relies on for other things
> >>> and not just exclusively HBase? That should be just fine. We do it all the
> >>> time.
> >>>
> >>>
> >> Ok thanks.
> >>
> >>
> >>
> >>> Are you worried about putting too much load on it?
> >>
> >>
> >> For the POC it won't matter that much. I can get my stuff up and running.
> >>
> >>
> >>> I guess that depends on the type of workload that you have and what you
> >>> do with it. But generally I think it is nice to have all nodes be the same
> >>> (so all workers are datanode and region server), such that you don't have
> >>> to scale them out separately.
> >>>
> >>
> >>> Peter, are you based in The Netherlands by any chance? There is a NoSQL
> >>> meetup group in NL (http://www.meetup.com/nosql-nl/) with meetups every
> >>> now and then. The next one is on January 24 and is all about HBase. We're
> >>> doing an on-the-spot install on a number of the laptops present to create
> >>> a temporary cluster and play around with it. I have been working with
> >>> Hadoop and HBase for the past couple of months, so if you care to come by,
> >>> I'd be happy to share some experiences.
> >>
> >> Yes, I live in Holland. I'm a former Xebia employee :) I think I'll visit
> >> one of the NoSQL meetups.
> >>
> >> We are building a kind of application server where instead of providing
> >> services like JMS, Servlet, EJBs etc. we are providing services for secured
> >> document storage, message exchange, semantic analysis of documents etc. It
> >> is all based on GigaSpaces, but I have the impression (after working more
> >> than a year with it) that it is very time consuming to get right. Apart
> >> from all the correctness issues (and there were/are many, based on bad
> >> usage of GigaSpaces and architectural choices) there are also some
> >> performance/scalability issues that need solving.
> >>
> >> So I decided to rewrite the main use cases using HBase. I had most of the
> >> functionality up and running in a few days and most of the 'bad
> >> architectural choices' we are going to remove in the next 6 months are not
> >> there from the beginning (e.g. using streams instead of byte arrays for
> >> document processing.. how stupid can you be). It also was a nice exercise
> >> to play with HBase and less consistent solutions.
> >>
> >> I normally work on realizing very high consistency for Multiverse:
> >>
> >> http://multiverse.codehaus.org
> >>
> >> So I want to have some hands on experience with using less consistent
> >> solutions.
> >>
> >>
> >>>
> >>> Friso
> >>>
> >>>
> >>>
> >>> On 5 jan 2011, at 14:41, Peter Veentjer wrote:
> >>>
> >>>> Hi Guys,
> >>>>
> >>>> I'm currently writing a POC based on HBase and I spend more time on
> >>>> writing a UI than on writing the HBase functionality. So I'm very excited
> >>>> about exploring HBase further and doing some serious performance and
> >>>> scalability tests and see if we can use it as core technology instead of
> >>>> the time/resource intensive GigaSpaces.
> >>>>
> >>>> My question:
> >>>>
> >>>> I'm currently using HBase and I also want to use the HDFS directly to
> >>>> store files. If the HBase server(s) is installed, can I directly access
> >>>> the HDFS of these servers, or is it better to set up a separate Hadoop
> >>>> server for running HDFS?
> >>>
> >>>
> >>
>
>
