hbase-user mailing list archives

From Blake Lemoine <bal2...@gmail.com>
Subject Re: Mongo vs HBase
Date Thu, 11 Aug 2011 03:07:02 GMT
I'm just curious here.  I'm currently working on a Google Summer of Code
project that uses HBase, and several times now I've built secondary
indices following what I think are standard practices.  Is there any
principled reason that this process couldn't be automated, or is it just that
no one has implemented it yet?
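
For context, the hand-built pattern in question can be sketched roughly as below. This is a minimal in-memory illustration, not HBase client code: `SortedTable` and `IndexedTable` are hypothetical names standing in for a data table and a separate index table whose row keys are "indexed value + original row key". It assumes string row keys and values, and it ignores the fact that the two writes are not atomic across tables.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # sorted list of row keys
        self._rows = {}   # row key -> stored value

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def delete(self, key):
        if key in self._rows:
            del self._rows[key]
            self._keys.pop(bisect.bisect_left(self._keys, key))

    def get(self, key):
        return self._rows.get(key)

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs for every key starting with prefix."""
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._rows[self._keys[i]]
            i += 1


class IndexedTable:
    """Data table plus one index table, kept in sync on every put."""

    def __init__(self, indexed_column):
        self.data = SortedTable()
        self.index = SortedTable()
        self.col = indexed_column

    def _index_key(self, value, row_key):
        # The separator keeps "Par" from matching rows indexed under "Paris".
        return f"{value}\x00{row_key}"

    def put(self, row_key, row):
        old = self.data.get(row_key)
        if old is not None and self.col in old:
            # Remove the stale index entry before writing the new one.
            self.index.delete(self._index_key(old[self.col], row_key))
        self.data.put(row_key, dict(row))
        if self.col in row:
            self.index.put(self._index_key(row[self.col], row_key), row_key)

    def find_by(self, value):
        """Look up rows by indexed value: prefix scan, then point reads."""
        prefix = f"{value}\x00"
        return [self.data.get(rk) for _, rk in self.index.scan_prefix(prefix)]


users = IndexedTable("city")
users.put("u1", {"city": "Paris", "name": "Ana"})
users.put("u2", {"city": "Lyon", "name": "Bo"})
users.put("u1", {"city": "Lyon", "name": "Ana"})  # update moves the index entry
print(users.find_by("Lyon"))
```

The wrapper is where the automation would live: every put goes through one code path that updates both tables, so callers never touch the index directly.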

On Wed, Aug 10, 2011 at 7:57 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:

> Mongodb does an excellent job at single node scalability - they use
> mmap and many smart things and really kick ass ... ON A SINGLE NODE.
>
> That single node must have RAID (RAID is going out of fashion, btw),
> and you won't be able to scale without resorting to:
> - replication (complex setup!)
> - sharding
>
> mongo claims to help on the last item, but it is still a risk point.
>
> For really large data that must span multiple machines, there is no
> "clustered sql" type solution that isn't (a) borked in various ways
> (Oracle RAC, I'm looking at you) or (b) stupid expensive (Oracle RAC,
> STILL looking at you).
>
> Tools like HBase give you scalability at the cost of features (no
> automated secondary indexing, no query language).
>
> Welcome... to... big... data.
>
> -ryan
>
> On Thu, Aug 11, 2011 at 12:44 AM, Edward Capriolo <edlinuxguru@gmail.com> wrote:
> > On Wed, Aug 10, 2011 at 4:26 PM, Li Pi <li@cloudera.com> wrote:
> >
> >> You'll have to build your own secondary indexes for now.
> >>
> >> On Wed, Aug 10, 2011 at 1:15 PM, Laurent Hatier <laurent.hatier@gmail.com> wrote:
> >>
> >> > Yes, I have heard of this index, but is it available in HBase 0.90.3?
> >> >
> >> > 2011/8/10 Chris Tarnas <cft@email.com>
> >> >
> >> > > Hi Laurent,
> >> > >
> >> > > Without more details on your schema and how you are finding that
> >> > > number in your table, it is impossible to fully answer the question.
> >> > > I suspect what you are seeing is mongo's native support for
> >> > > secondary indexes. If you were to add secondary indexes in HBase,
> >> > > then retrieving that row should be on the order of 3-30ms. If that
> >> > > is your main query method, then you could reorganize your table to
> >> > > make that long number your row key; then you would get even faster
> >> > > reads.
> >> > >
> >> > > -chris
> >> > >
> >> > >
> >> > > On Aug 10, 2011, at 10:02 AM, Laurent Hatier wrote:
> >> > >
> >> > > > Hi all,
> >> > > >
> >> > > > I would like to know why MongoDB is faster than HBase at selecting
> >> > > > items.
> >> > > > Here is my case:
> >> > > > I've inserted 4'000'000 rows into HBase and MongoDB, and I must
> >> > > > compute the geolocation from an IP address. I calculate a Long
> >> > > > number from the IP and look it up among the 4'000'000 rows.
> >> > > > It takes 5 ms to select the right row with Mongo, whereas HBase
> >> > > > takes 5 seconds.
> >> > > > I think the reason is the cur.limit(1) method in MongoDB, but is
> >> > > > there no function like this in HBase?
> >> > > >
> >> > > > --
> >> > > > Laurent HATIER
> >> > > > Second-year student in the Engineering Cycle at EISTI
> >> > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Laurent HATIER
> >> > Second-year student in the Engineering Cycle at EISTI
> >> >
> >>
> >
> > http://www.xtranormal.com/watch/6995033/mongo-db-is-web-scale
> >
>
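
To make the row-key suggestion quoted above concrete, here is a rough sketch of turning an IP into a Long, encoding it as a fixed-width key so that byte order matches numeric order, and fetching only the first row at or after that key (roughly the role `cur.limit(1)` plays in the Mongo query). It is an in-memory stand-in, not HBase client code. `RangeTable` is a hypothetical name, and keying each row by the last IP of an address range is an assumption about the schema that the thread does not confirm.

```python
import bisect
import ipaddress
import struct


def ip_to_long(ip):
    """Convert a dotted IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(ip))


def row_key(n):
    """Fixed-width big-endian bytes: lexicographic order equals numeric order."""
    return struct.pack(">Q", n)


class RangeTable:
    """Toy sorted table keyed by the last IP of each range (assumed schema)."""

    def __init__(self):
        self._keys = []     # sorted row keys
        self._values = {}   # row key -> location

    def put(self, range_end_ip, location):
        key = row_key(ip_to_long(range_end_ip))
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = location

    def first_at_or_after(self, ip):
        """Seek to the key for ip and return one row, or None past the end."""
        i = bisect.bisect_left(self._keys, row_key(ip_to_long(ip)))
        if i == len(self._keys):
            return None
        return self._values[self._keys[i]]


table = RangeTable()
table.put("10.0.0.255", "Paris")
table.put("10.0.1.255", "Lyon")
print(ip_to_long("1.2.3.4"))                # 16909060
print(table.first_at_or_after("10.0.1.7"))  # Lyon
```

The lookup is a seek into sorted keys followed by reading a single row, instead of filtering all 4'000'000 rows on a column value.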
