hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: HBase - bulk loading files
Date Fri, 19 Dec 2014 21:50:30 GMT
Can you let us know the HBase and Hadoop versions you're using?

Were the clusters taking load from other sources when ImportTsv was running?

Cheers

On Fri, Dec 19, 2014 at 1:43 PM, Rama Ramani <rama.ramani@live.com> wrote:

> Hello, I am bulk loading a set of files (about 400MB each) with
> "|" as the delimiter using ImportTsv. It takes a long time for the 'map'
> job to complete on both a 4-node and a 16-node cluster. I tried the option
> to generate the output (providing -Dimporttsv.bulk.output), which took a
> long time, indicating that the generation of the output files needs improvement.
> I am seeing about 8000 rows/sec for this dataset; the 400MB ingestion
> takes about 5-6 mins. How can I improve this? Is there an alternate tool I
> can use?
> Thanks,
> Rama
