hbase-user mailing list archives

From Genady Gillin <gena...@exelate.com>
Subject Re: HBase random read technics
Date Thu, 22 Jan 2009 20:38:34 GMT
Hi,

On Thu, Jan 22, 2009 at 10:05 PM, stack <stack@duboce.net> wrote:
>
> Genady Gillin wrote:
>
>> Hi,
>>
>> We use HBase 0.19 RC2; our data (~800GB) resides in one table (is that bad?).
>> The table schema is pretty simple: two column families, one for keys and
>> a second for values; each key could have one or more values (~100).
>>
>
> Keys in one column family and values in another?  Why not both in the
> one column family?


Because each key could have one or more values.


>
> You use the keys in first column family to do lookups into the second?  Can
> you sort the keys and then start a scanner with perhaps start and stop keys
> being first and last from file?  Does that run faster?


The keys that I want to read are sorted but not sequential, so a scan is
useless here.
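
For splitting the sorted keys between several clients (the approach described
further down this thread), a minimal sketch in plain Java might look like the
following. It only partitions the key list into contiguous chunks;
`KeyPartitioner` and `partition` are hypothetical names, not HBase API, and the
actual per-key HBase reads are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of HBase): divides an already-sorted key
// list into contiguous chunks, one per client.
public class KeyPartitioner {

    /**
     * Splits sortedKeys into at most `clients` contiguous chunks whose
     * sizes differ by no more than one. Input order is preserved.
     */
    public static <T> List<List<T>> partition(List<T> sortedKeys, int clients) {
        if (clients <= 0) {
            throw new IllegalArgumentException("clients must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        int n = sortedKeys.size();
        int base = n / clients;   // minimum chunk size
        int extra = n % clients;  // the first `extra` chunks get one more key
        int start = 0;
        for (int i = 0; i < clients && start < n; i++) {
            int size = base + (i < extra ? 1 : 0);
            chunks.add(new ArrayList<>(sortedKeys.subList(start, start + size)));
            start += size;
        }
        return chunks;
    }
}
```

Each chunk would then be handed to its own client thread or process. Because
the chunks are contiguous in sort order, each client should tend to touch a
narrower range of rows than it would with a random split, though I have not
measured whether that helps here.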

>
> But sounds like you need to run an MR job.  You tried that and it failed.
>  You tried on same hardware?  My guess is you were running into the issue
> we're discussing in other email ('.... slept too long...').


Not fair to use inside info :) Hardware performance could be an issue. We're
going to upgrade the hardware as a result of your assistance, so I'll try to
run the MR job on the new system.

Thanks,
Gennady

>
> St.Ack
>
>>
>> Thanks,
>> Gennady
>>
>> On Thu, Jan 22, 2009 at 7:46 PM, stack <stack@duboce.net> wrote:
>>
>>>
>>> Genady wrote:
>>>
>>>>
>>>> Hi,
>>>>
>>>> Just wondering if somebody could recommend a random read strategy for
>>>> looking up a big group of keys (100M) in a hadoop/hbase cluster. Using
>>>> one client is very slow; splitting the input into smaller groups and
>>>> running each one with a different client certainly improves performance,
>>>> but the maximum speed I'm getting is ~3300 reads/sec. I've tried to run
>>>> the search as a map-reduce task, doing the HBase reads from the map or
>>>> reduce, but HBase starts to fail. So are a hardware upgrade and creating
>>>> HBase in-memory tables the only direction here?
>>>>
>>>
>>> Tell us more about your table schema, data sizes, and the types of
>>> query.  What performance do you need from hbase?  Do your rows have many
>>> columns and you are trying to get all columns when you query for example?
>>> Are you on 0.19.0 Genady (sorry if you've answered this question in the
>>> near past)?
>>> St.Ack
>>>