hbase-user mailing list archives

From Nicolas Liochon <nkey...@gmail.com>
Subject Re: Hbase import Tsv performance (slow import)
Date Tue, 23 Oct 2012 16:46:52 GMT
Hi,

Schema design matters a lot here. At a minimum, have a look at the row key
design section of the book:
http://hbase.apache.org/book.html#rowkey.design
As for the config, could you pastebin the HDFS and HBase config files you used?
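
To illustrate the row key point: if the keys in the CSV happen to be
sequential (timestamps, counters), all writes tend to land on one region at a
time. One common workaround described in the book is to "salt" the key with a
short, stable prefix. The snippet below is only a minimal sketch of that idea;
the bucket count, the key format and the function names are assumptions made
for illustration and are not taken from the setup described in this thread.

```python
import zlib
from typing import List

# Assumption: number of salt buckets, e.g. roughly the number of pre-split regions.
NUM_BUCKETS = 16


def salted_row_key(natural_key: str, num_buckets: int = NUM_BUCKETS) -> bytes:
    """Prefix the key with a stable, hash-derived bucket id.

    The same natural key always maps to the same bucket, so reads can
    recompute the prefix, while sequential keys are spread across buckets.
    """
    bucket = zlib.crc32(natural_key.encode("utf-8")) % num_buckets
    return f"{bucket:02d}-{natural_key}".encode("utf-8")


def split_points(num_buckets: int = NUM_BUCKETS) -> List[bytes]:
    """Split keys matching the bucket prefixes, usable when pre-splitting a table."""
    return [f"{b:02d}-".encode("utf-8") for b in range(1, num_buckets)]
```

The keys would be rewritten this way before the import, and the table created
with matching split points so that each bucket maps to its own region. The
trade-off is that range scans over the natural key then need one scan per
bucket.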

N.

On Tue, Oct 23, 2012 at 5:48 PM, Nick maillard <
nicolas.maillard@fifty-five.com> wrote:

> Hi everyone,
>
> I'm starting out with HBase and testing it for our needs. I have set up a
> Hadoop cluster of three machines and an HBase cluster on top of the same
> three machines: one master and two slaves.
>
> I am testing the import of a 5 GB CSV file with the ImportTsv tool. I copy
> the file into HDFS and then use ImportTsv to load it into HBase.
>
> Right now it takes a little over an hour to complete. It creates around 2
> million entries in one table with a single column family.
> If I use bulk uploading it goes down to 20 minutes.
>
> The Hadoop job has 21 map tasks, but they all seem to take a very long time
> to finish, and many tasks end up timing out.
>
> I am wondering what I have missed in my configuration. I have followed the
> prerequisites in the documentation, but I am really unsure what is causing
> this slowdown. If I run the wordcount example on the same file, it takes
> only minutes to complete, so I am guessing the issue lies in my HBase
> configuration.
>
> Any help or pointers would be appreciated.
>
>
