hbase-user mailing list archives

From Rama Ramani <rama.ram...@live.com>
Subject RE: HBase - bulk loading files
Date Fri, 09 Jan 2015 20:53:52 GMT
Is there a way to specify Salted buckets with HBase ImportTsv while doing bulk load?
 
Thanks
Rama
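[For context: as far as I know, ImportTsv has no built-in salting option; salting is normally applied by rewriting the row key before (or during) import, e.g. in a preprocessing step. A minimal sketch of one common scheme, assuming a hypothetical 16-bucket hash-prefix salt (the bucket count and `salted_key` helper are illustrative, not part of ImportTsv):

```python
import hashlib

NUM_BUCKETS = 16  # assumption: 16 salt buckets, chosen to match region count


def salted_key(row_key: str) -> str:
    # Derive a stable bucket from a hash of the original key and
    # prepend it as a fixed-width prefix, so identical keys always
    # land in the same bucket while sequential keys spread out.
    digest = hashlib.md5(row_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % NUM_BUCKETS
    return f"{bucket:02d}-{row_key}"
```

Reads then have to fan out across all buckets, which is the usual trade-off of salting.]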
 
From: rama.ramani@live.com
To: user@hbase.apache.org
Subject: RE: HBase - bulk loading files
Date: Fri, 19 Dec 2014 14:09:09 -0800




HBase 0.98.0.2.1.9.0-2196-hadoop2
Hadoop 2.4.0.2.1.9.0-2196
Subversion git@github.com:hortonworks/hadoop-monarch.git -r cb50542bc92fb77dee52

No, the clusters were not taking additional load.

Thanks
Rama
> Date: Fri, 19 Dec 2014 13:50:30 -0800
> Subject: Re: HBase - bulk loading files
> From: yuzhihong@gmail.com
> To: user@hbase.apache.org
> 
> Can you let us know the HBase and Hadoop versions you're using?
> 
> Were the clusters taking load from other sources when ImportTsv was running?
> 
> Cheers
> 
> On Fri, Dec 19, 2014 at 1:43 PM, Rama Ramani <rama.ramani@live.com> wrote:
> 
> > Hello,
> >
> > I am bulk loading a set of files (about 400MB each) with "|" as the
> > delimiter using ImportTsv. It takes a long time for the 'map' job to
> > complete on both a 4-node and a 16-node cluster. I tried the option to
> > generate the output files instead (providing -Dimporttsv.bulk.output),
> > and that was also slow, which suggests the HFile generation itself is
> > the bottleneck. I am seeing about 8000 rows/sec for this dataset; the
> > 400MB ingestion takes about 5-6 minutes. How can I improve this? Is
> > there an alternate tool I can use?
> >
> > Thanks
> > Rama
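
[One common fix for a slow map/HFile-generation phase is to pre-split the table before the bulk load, so writes and HFile output parallelize across regions instead of funneling into one. A hypothetical sketch of generating split points for a table keyed with a fixed-width two-digit prefix (the prefix scheme is an assumption for illustration, not something ImportTsv does for you):

```python
def salt_split_points(num_buckets: int) -> list[bytes]:
    # Region split points for a table whose keys start with a
    # zero-padded decimal bucket prefix ("00".."NN"). HBase creates
    # num_buckets regions from num_buckets - 1 split keys, so we emit
    # the first key of every bucket after the first.
    return [f"{b:02d}".encode("utf-8") for b in range(1, num_buckets)]
```

The resulting byte strings would be passed as split keys at table-creation time (e.g. via the shell's SPLITS option or Admin.createTable), so each bucket gets its own region from the start.]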