hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: HBase Bulk Lookup
Date Mon, 22 Sep 2014 16:30:47 GMT
Hi Bin,

100M rows of 1 byte each is about 100MB.
100M rows of 1KB each is about 100GB.

What is your record size and what is your SLA?
Do you expect 100GB to be transferred in a few seconds?
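
To put rough numbers on that, here is a minimal sketch of the arithmetic. It uses decimal units (1KB = 1000 bytes) and ignores HBase's per-cell overhead (row key, column family, qualifier and timestamp are stored with every cell), so real sizes will be larger. The 1 Gbit/s link (125 MB/s) is only an assumed figure for illustration, and the class and method names are made up for this example.

```java
import java.util.Locale;

// Back-of-the-envelope sizing for "N rows of B bytes each".
final class SizeEstimate {

    // Raw payload size in bytes; throws ArithmeticException on overflow.
    static long totalBytes(long rows, long bytesPerRow) {
        return Math.multiplyExact(rows, bytesPerRow);
    }

    // Formats a byte count using decimal units (1 KB = 1000 B).
    static String human(long bytes) {
        String[] units = {"B", "KB", "MB", "GB", "TB"};
        double value = bytes;
        int i = 0;
        while (value >= 1000 && i < units.length - 1) {
            value /= 1000;
            i++;
        }
        return String.format(Locale.ROOT, "%.0f %s", value, units[i]);
    }

    // Time to move the payload at a given sustained throughput.
    static double transferSeconds(long bytes, long bytesPerSecond) {
        return (double) bytes / bytesPerSecond;
    }

    public static void main(String[] args) {
        long rows = 100_000_000L;           // 100M rows, from the question
        long assumedLink = 125_000_000L;    // ASSUMPTION: 1 Gbit/s = 125 MB/s

        long small = totalBytes(rows, 1);     // 1-byte records
        long large = totalBytes(rows, 1000);  // 1KB records

        System.out.println(human(small));
        System.out.println(human(large));
        System.out.println(String.format(Locale.ROOT, "%.0f s",
                transferSeconds(large, assumedLink)));
    }
}
```

Under those assumptions, moving the full 100GB takes minutes, not seconds, which is why the record size and the number of ids per request matter more than the total table size.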

How do you query your data? A single get? All of it?

You might want to give a lot more detail about your use cases if you want
more accurate advice.

HBase is VERY good for random writes and random reads. It can also scale
(almost) infinitely.


100MB is pretty small for HBase. From what I understand of your use case so
far, HBase bulk load + HBase get/multi-get is what you need, but as I said
above, I need more details.
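
On the lookup side, one possible shape (a sketch only, not something prescribed by HBase) is to split the uploaded product ids into fixed-size batches and issue one multi-get per batch. The helper below is plain Java with no HBase dependency; the class name, method name and batch size are hypothetical. Each batch would then be turned into a List<Get> and passed to the get(List<Get> gets) call Ted quoted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Splits a list of ids into consecutive batches of at most batchSize items.
final class IdBatcher {

    static <T> List<List<T>> partition(List<T> ids, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += batchSize) {
            int end = Math.min(start + batchSize, ids.size());
            // Copy the view so each batch is independent of the input list.
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> productIds = Arrays.asList("p1", "p2", "p3", "p4", "p5");
        // ASSUMPTION: a batch size of 2 is used only to keep the demo small.
        for (List<String> batch : partition(productIds, 2)) {
            // In a real client, build one Get per id here and send the
            // whole batch in a single multi-get call.
            System.out.println(batch);
        }
    }
}
```

A sensible batch size depends on record size and cluster layout, so it is something to measure in the prototype.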

JM



2014-09-22 12:24 GMT-04:00 Bin Wang <binwang.cu@gmail.com>:

> Hi Ted,
>
> I have not dived into the programming part yet... I am still at the POC,
> pick-the-right-tool stage. Based on your experience, do you think
> get(List<Get> gets) will return results from a table at the 100M level in
> an interactive time, say a few seconds?
>
> If that is the case, I will start working on a prototype.
>
> Bin
>
> On Mon, Sep 22, 2014 at 10:00 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>
> > bq. upload a list of product ids
> >
> > Have you looked at the following API in HTable ?
> >
> >   public Result[] get(List<Get> gets) throws IOException {
> >
> > Cheers
> >
> > On Mon, Sep 22, 2014 at 8:14 AM, Bin Wang <binwang.cu@gmail.com> wrote:
> >
> > > Hi there,
> > >
> > > I have a use case where I need to do bulk lookups in a table of 100
> > > million key-value pairs, where the key is the unique ID (product id) and
> > > the value is the inventory history (time series) for that particular part.
> > >
> > > I want users to upload a list of product ids, and I am wondering if HBase
> > > is the right tool to return the corresponding values at interactive speed.
> > >
> > > If not, I have heard of Solr/ElasticSearch, mongo, redis, and Cassandra
> > > as well, and I am wondering which tool is the best fit for my use case.
> > >
> > > Thanks for any suggestion.
> > >
> > > Bin
> > >
> >
>
