hbase-user mailing list archives

From Vibhav Mundra <mun...@gmail.com>
Subject Re: Hbase scans taking a lot of time
Date Fri, 25 Jan 2013 17:59:22 GMT
I have 13 column families, which I guess is okay?

-Vibhav


On Fri, Jan 25, 2013 at 11:01 PM, Luke Lu <llu@apache.org> wrote:

> You'll have this problem if you have a large number of column families
> being scanned/populated at the same time. Make sure the data you
> scan/populate frequently are in the same column family (you can have many
> columns in a column family). Unlike BigTable/Hypertable, which have the
> concept of locality/access groups, HBase always stores column families in
> separate files, essentially making a column family not only a logical
> grouping mechanism but also a physical locality group.
>
>
> On Fri, Jan 25, 2013 at 1:10 AM, Vibhav Mundra <mundra@gmail.com> wrote:
>
> > I am facing a very strange problem with HBase.
> >
> > This is what I did:
> > a) Created a table using pre-partitioned splits.
> > b) The column families are compressed with LZO.
> > c) With this configuration I am able to populate 2 million rows per
> > minute into HBase.
> > d) I have created a table with 300 million odd rows, which took roughly
> > 3 hours to populate; the data size is 25GB.
> >
> > e) But when I query for data, the performance I am getting is very bad.
> >    Basically this is what I am seeing: high CPU, no disk I/O, and network
> > I/O at a rate of 6~7 MB/sec.
> >
> >
> > Because of this, scanning the entries of the table using Hive takes
> > ages.
> > Basically it takes around 24 hours to scan the table. Any idea how
> > to debug this?
> >
> >
> > -Vibhav
> >
>
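
To put the figures quoted in the thread side by side, here is a rough back-of-the-envelope sketch in Python. It uses only numbers stated in the original message (300 million rows, 25GB, 3 hours to load, 24 hours to scan). Treating 25GB as 25 GiB of stored data is an assumption, and the variable and function names are purely illustrative.

```python
# Back-of-the-envelope rates from the figures quoted in the thread.
ROWS = 300_000_000           # "300 million odd rows"
DATA_BYTES = 25 * 1024**3    # "25GB" (assumed to mean GiB of stored data)
LOAD_HOURS = 3               # time reported to populate the table
SCAN_HOURS = 24              # time reported for the Hive scan


def per_second(total, hours):
    """Average rate per second for `total` units processed over `hours`."""
    return total / (hours * 3600)


load_rows_s = per_second(ROWS, LOAD_HOURS)
scan_rows_s = per_second(ROWS, SCAN_HOURS)
scan_mb_s = per_second(DATA_BYTES, SCAN_HOURS) / 1024**2

print(f"load: {load_rows_s:,.0f} rows/s")
print(f"scan: {scan_rows_s:,.0f} rows/s")
print(f"scan: {scan_mb_s:.2f} MB/s of stored data")
print(f"scan is {load_rows_s / scan_rows_s:.0f}x slower than load")
```

Under these assumptions the scan averages roughly 3,500 rows per second and about 0.3 MB/s of stored data, around 8 times slower than the load and well below the 6~7 MB/sec of network I/O reported above.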
