hbase-dev mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Hbase scans taking a lot of time
Date Fri, 25 Jan 2013 23:56:55 GMT
Sorry I meant scan caching. (not batching)

From: lars hofhansl <larsh@apache.org>
To: "user@hbase.apache.org" <user@hbase.apache.org>; "dev@hbase.apache.org" <dev@hbase.apache.org>
Sent: Friday, January 25, 2013 2:00 PM
Subject: Re: Hbase scans taking a lot of time
Enable scan batching in Hive.
You're probably performing 300m RPC requests, i.e. you're mostly measuring network latency.

-- Lars
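
To make the point about RPC count concrete, here is a rough back-of-the-envelope sketch. It assumes each scanner round trip returns at most `caching` rows (the value set with `Scan.setCaching` or `hbase.client.scanner.caching` in the HBase client), and it takes the 300 million rows and 24 hours from the original message. The class and method names are made up for illustration, and the model ignores server-side work entirely.

```java
public final class ScanRpcEstimate {

    // Round trips needed to scan `rows` rows when each RPC returns
    // at most `caching` rows (ceiling division).
    public static long rpcCount(long rows, int caching) {
        if (rows < 0 || caching <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
        }
        return (rows + caching - 1) / caching;
    }

    // Average wall-clock time spent per row, in microseconds.
    public static double avgMicrosPerRow(long rows, double hours) {
        return hours * 3600.0 * 1_000_000.0 / rows;
    }

    public static void main(String[] args) {
        long rows = 300_000_000L; // table size reported in the thread

        // Compare a caching value of 1 with two larger, arbitrary values.
        for (int caching : new int[] {1, 100, 1000}) {
            System.out.printf("caching=%d -> %d RPCs%n", caching, rpcCount(rows, caching));
        }

        // 24 hours over 300 million rows works out to 288 microseconds per row.
        System.out.printf("%.0f us per row%n", avgMicrosPerRow(rows, 24.0));
    }
}
```

With a caching value of 1 the scan needs one round trip per row, and 288 microseconds per row is in the range of a single network round trip, which fits the diagnosis that the scan is mostly measuring latency. A caching value of 1000 would cut the round trips to 300,000 under the same assumptions.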

From: Vibhav Mundra <mundra@gmail.com>
To: user@hbase.apache.org; dev@hbase.apache.org 
Sent: Friday, January 25, 2013 1:10 AM
Subject: Hbase scans taking a lot of time

I am facing a very strange problem with HBase.

This is what I did:
a) Created a table using pre-partitioned splits.
b) The column families are compressed with LZO.
c) With the above configuration I am able to populate 2 million rows per
minute into HBase.
d) I have created a table with 300 million-odd rows, which took me roughly 3
hours to populate; the data size is 25 GB.

e) But when I query the data, the performance I am getting is very bad.
   This is what I am seeing: high CPU, no disk I/O, and network
I/O at a rate of 6-7 MB/s.

Because of this, scanning all the entries of the table using Hive takes
around 24 hours. Any idea how to debug this?
