hbase-user mailing list archives

From Nick maillard <nicolas.maill...@fifty-five.com>
Subject Re: Hbase import Tsv performance (slow import)
Date Wed, 24 Oct 2012 14:35:31 GMT
Hello everyone,

I am still looking into this issue.
I have run several tests and the results are surprising.
If I set mapred.tasktracker.map.tasks.maximum to 28,
I get room for a total of 84 concurrent map tasks on my cluster and the process
takes about 1h15, with each task taking about 1h10. The whole file is split into
80 tasks.

If I set mapred.tasktracker.map.tasks.maximum to 3,
I get room for a total of 6 concurrent map tasks on my cluster and the process
takes about the same amount of time, 1h20. The whole file is still split into
80 tasks, but now of course each individual task only takes a couple of minutes.

It is as if the overall ImportTsv run must take a little over an hour, and the
duration of the map tasks varies accordingly.
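
To make the comparison concrete, here is a rough back-of-the-envelope sketch in
Python using only the figures quoted above. It assumes that the "total" numbers
(84 and 6) are the number of map tasks that can run at once, and that tasks run
in back-to-back waves; both are assumptions, and the function names are made up
for illustration.

```python
import math

def waves(total_tasks, concurrent_tasks):
    """Rounds needed to run every map task, assuming full back-to-back waves."""
    return math.ceil(total_tasks / concurrent_tasks)

def implied_task_minutes(wall_clock_minutes, total_tasks, concurrent_tasks):
    """Average task duration implied by the wall-clock time (rough estimate)."""
    return wall_clock_minutes / waves(total_tasks, concurrent_tasks)

# (label, map tasks in the job, concurrent map tasks, wall-clock minutes)
runs = [
    ("maximum=28", 80, 84, 75),  # about 1h15
    ("maximum=3", 80, 6, 80),    # about 1h20
]

for label, tasks, concurrent, wall in runs:
    per_task = implied_task_minutes(wall, tasks, concurrent)
    throughput = tasks / wall  # map tasks completed per minute, cluster-wide
    print(f"{label}: {waves(tasks, concurrent)} wave(s), "
          f"~{per_task:.1f} min per task, {throughput:.2f} tasks/min")
```

Under these assumptions both runs finish roughly one map task per minute across
the whole cluster, whatever the per-node setting is.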

There is definitely something I am doing wrong.
