Subject: Re: Fine tunning
From: Ranjini Rathinam <ranjinibecse@gmail.com>
To: user@hadoop.apache.org
Date: Tue, 7 Jan 2014 09:26:14 +0530

Hi,

I have a table in HBase named currencymaster.

For example:

id,currency
1,INR
2,USD

Now I am dumping a text file, which contains currency as one of its fields, into the tableCurrency table in HBase using MapReduce code.

If the currency field value from the text file matches a value in the currency master table, then one more column, Valid_Ind, needs to be added to tableCurrency: Valid_Ind will be "0" when the values match and "1" when they do not.

I have attached my code. Please suggest why the validation part takes so long for just 13250 records.
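In other words, the validation boils down to a membership check against the master table's currency values. A minimal sketch of just that rule, with hypothetical variable names (the full job code follows below):

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ValidIndRule {
    public static void main(String[] args) {
        // Values that would come from the currencymaster table.
        Set<String> masterCurrencies = new HashSet<String>(Arrays.asList("INR", "USD"));
        // Currency field parsed from one input record.
        String currency = "INR";
        // "0" if the currency exists in the master table, "1" otherwise.
        String validInd = masterCurrencies.contains(currency) ? "0" : "1";
        System.out.println(validInd); // prints 0
    }
}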
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.BinaryComparator;
import org.apache.hadoop.hbase.filter.CompareFilter;
import org.apache.hadoop.hbase.filter.Filter;
import org.apache.hadoop.hbase.filter.ValueFilter;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class MapReduceTable {

    private static Configuration conf = null;

    static {
        Configuration customConf = new Configuration();
        // customConf.setStrings("hbase.zookeeper.quorum", "tss4l20b1.svr.us.jpmchase.net,tss4l20b2.svr.us.jpmchase.net,tss4l20a1.svr.us.jpmchase.net");
        customConf.setStrings("hbase.zookeeper.quorum", "localhost");
        customConf.setLong("hbase.rpc.timeout", 60000000);
        // Rows fetched per scanner RPC; an unusually large value.
        customConf.setLong("hbase.client.scanner.caching", 60000000);
        conf = HBaseConfiguration.create(customConf);
        // customConf = null;
    }

    // First pass: loads the first eight fields of each record into one table.
    static class Map extends Mapper<LongWritable, Text, Text, Put> {
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String messageStr = value.toString();
            String[] logRecvArr = messageStr.split(",");
            Put put1 = new Put(Bytes.toBytes(logRecvArr[0]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("firstName"), Bytes.toBytes(logRecvArr[1]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("middleName"), Bytes.toBytes(logRecvArr[2]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("LastName"), Bytes.toBytes(logRecvArr[3]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("Company"), Bytes.toBytes(logRecvArr[4]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("location"), Bytes.toBytes(logRecvArr[5]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("dept"), Bytes.toBytes(logRecvArr[6]));
            put1.add(Bytes.toBytes("Id"), Bytes.toBytes("exp"), Bytes.toBytes(logRecvArr[7]));
            context.write(new Text(logRecvArr[0]), put1);
        }
    }

    // Second pass: loads the remaining fields plus the Valid_Ind lookup result.
    static class MyMapper extends Mapper<LongWritable, Text, Text, Put> {
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String messageStr = value.toString();
            String val = "";
            String[] logRecvArr = messageStr.split(",");
            Put put = new Put(Bytes.toBytes(logRecvArr[0]));
            // Opens the "curr" table and runs a filtered scan once per input
            // record; this per-record scan is the expensive part.
            Filter valFilter = new ValueFilter(CompareFilter.CompareOp.EQUAL,
                    new BinaryComparator(Bytes.toBytes(logRecvArr[15])));
            HTable table1 = new HTable(conf, "curr");
            Scan s1 = new Scan();
            s1.setFilter(valFilter);
            ResultScanner ss1 = table1.getScanner(s1);
            for (Result r1 : ss1) {
                for (KeyValue kv1 : r1.raw()) {
                    if (Bytes.toString(kv1.getQualifier()).equals("currency")) {
                        val = new String(kv1.getValue());
                    }
                }
            }
            // Release the scanner and table (the original code leaked both).
            ss1.close();
            table1.close();
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("Address1"), Bytes.toBytes(logRecvArr[8]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("Address2"), Bytes.toBytes(logRecvArr[9]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("Address3"), Bytes.toBytes(logRecvArr[10]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("phone"), Bytes.toBytes(logRecvArr[11]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("mobile"), Bytes.toBytes(logRecvArr[12]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("sal"), Bytes.toBytes(logRecvArr[13]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("Acctno"), Bytes.toBytes(logRecvArr[14]));
            put.add(Bytes.toBytes("Id"), Bytes.toBytes("currency"), Bytes.toBytes(logRecvArr[15]));
            if (val.equals(logRecvArr[15])) {
                put.add(Bytes.toBytes("Id"), Bytes.toBytes("Valid_Ind"), Bytes.toBytes("0"));
            } else {
                put.add(Bytes.toBytes("Id"), Bytes.toBytes("Valid_Ind"), Bytes.toBytes("1"));
            }
            context.write(new Text(logRecvArr[0]), put);
        }
    }

    public int execute() throws Exception {
        String input = "/user/hduser/INPUT/";
        Job job = new Job(conf, "TrandferHdfsToUserLog");
        job.setJarByClass(MapReduceTable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, input);
        job.setMapperClass(Map.class);
        // With a null reducer class this configures TableOutputFormat so the
        // mapper's Puts are written straight to the table.
        TableMapReduceUtil.initTableReducerJob("RanCount", null, job);
        job.setNumReduceTasks(0);
        System.out.println("Hello Hadoop 2nd Job!!" + job.waitForCompletion(true));
        return 0;
    }

    public int executeLast() throws Exception {
        String input = "/user/hduser/INPUT/";
        Job job = new Job(conf, "TrandferHdfsToUserLog");
        job.setJarByClass(MapReduceTable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        FileInputFormat.setInputPaths(job, input);
        job.setMapperClass(MyMapper.class);
        TableMapReduceUtil.initTableReducerJob("Rancount11", null, job);
        job.setNumReduceTasks(0);
        System.out.println("Hello Hadoop 2nd Job!!" + job.waitForCompletion(true));
        return 0;
    }

    public static void main(String[] args) throws Exception {
        new MapReduceTable().execute();
        new MapReduceTable().executeLast();
    }
}

Thanks in advance.

Ranjini

On Tue, Jan 7, 2014 at 12:36 AM, Hardik Pandya <smarty.juice@gmail.com> wrote:
> Can you please share how you are doing the lookup?
>
>
> On Mon, Jan 6, 2014 at 4:23 AM, Ranjini Rathinam <ranjinibecse@gmail.com> wrote:
>> Hi,
>>
>> I have an input file with 16 fields in it.
>>
>> Using MapReduce code, I need to load the HBase tables.
>>
>> The first eight fields have to go into one table in HBase, and the last
>> eight have to go into another HBase table.
>>
>> The data is loaded into the HBase table in 0.11 sec, but a lookup has
>> been added in the MapReduce code: the input file has one attribute named
>> currency, and there is a master table for currency. The two values need
>> to be matched in order to write the record.
>>
>> The table load that includes the lookup takes a long time: for 13250
>> records it takes 59 mins.
>>
>> How can I fine-tune this to reduce the loading time?
>>
>> Please help.
>>
>> Thanks in advance.
>>
>> Ranjini.R
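A note on the likely cause: MyMapper opens the "curr" table and runs a filtered scan for every single input record, so 13250 records means 13250 scans plus connection setup each time. A common pattern is to read the small master table once in the mapper's setup() and do the lookup from memory. Below is a minimal sketch under that assumption; it reuses the "curr" table, the "Id" family, and the "currency" qualifier from the code above, but the class name and structure are illustrative, not a tested drop-in replacement:

import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: cache the currency master table once per mapper task instead of
// scanning it for every input record.
public class CachedLookupMapper extends Mapper<LongWritable, Text, Text, Put> {

    private final Set<String> masterCurrencies = new HashSet<String>();

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // One scan of the small master table per mapper, not per record.
        HTable master = new HTable(context.getConfiguration(), "curr");
        try {
            ResultScanner scanner = master.getScanner(new Scan());
            try {
                for (Result r : scanner) {
                    for (KeyValue kv : r.raw()) {
                        if (Bytes.toString(kv.getQualifier()).equals("currency")) {
                            masterCurrencies.add(Bytes.toString(kv.getValue()));
                        }
                    }
                }
            } finally {
                scanner.close();
            }
        } finally {
            master.close();
        }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split(",");
        Put put = new Put(Bytes.toBytes(fields[0]));
        // The lookup is now an in-memory hash check instead of an HBase scan.
        String validInd = masterCurrencies.contains(fields[15]) ? "0" : "1";
        put.add(Bytes.toBytes("Id"), Bytes.toBytes("Valid_Ind"), Bytes.toBytes(validInd));
        // ...other put.add() calls for fields 8-15, as in MyMapper above...
        context.write(new Text(fields[0]), put);
    }
}

With the master cached, each record costs one hash lookup instead of an RPC round trip plus a filtered scan, which should bring the second job's runtime much closer to the first job's.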