hbase-user mailing list archives

From peterramesh <ramesh.ramas...@gmail.com>
Subject Re: Map Reduce performance
Date Fri, 26 Jun 2009 04:46:45 GMT

Hi Eric/Tim,

Thanks for your helpful points.

I have updated the Mapper implementation to remove the HTable instance, as
follows:

public static class InnerMapWithTOF extends MapReduceBase implements
			Mapper<LongWritable, Text, ImmutableBytesWritable, BatchUpdate> {

		public void map(LongWritable key, Text value,
				OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
				Reporter reporter) throws IOException {

			String[] splits = value.toString().split("\t");
			BatchUpdate bu = new BatchUpdate(splits[0]);

			// One put per column; the column key is family name + column name.
			for (int j = 0; j < HBaseTest.SNP_INFO_COLUMN_NAMES.length; j++) {
				bu.put(HBaseTest.SNP_FAMILY_NAMES[0]
						+ HBaseTest.SNP_INFO_COLUMN_NAMES[j],
						splits[j].getBytes());
			}

			output.collect(new ImmutableBytesWritable(splits[0].getBytes()), bu);
		}

	}

But in the above code I'm rebuilding the same column key
(HBaseTest.SNP_FAMILY_NAMES[0] + HBaseTest.SNP_INFO_COLUMN_NAMES[j])
for every column of every record. Is there any way to set it once, in the
JobConf object or elsewhere?
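
One possible way to avoid rebuilding the key per record is to compute the
column keys once and reuse them. The sketch below is plain Java without the
Hadoop/HBase classes, so that it runs on its own. The class name
ColumnKeySketch, the family "info:" and the three column names are
placeholders chosen for illustration; they are not the real HBaseTest
constants.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the mapper's column handling; the family and
// column names below are placeholders, not the real HBaseTest constants.
final class ColumnKeySketch {
    static final String FAMILY = "info:";                          // assumed family name
    static final String[] COLUMN_NAMES = {"rsid", "chrom", "pos"}; // assumed column names

    // Built once when the class loads, instead of once per column per record.
    static final String[] COLUMN_KEYS = buildColumnKeys(FAMILY, COLUMN_NAMES);

    static String[] buildColumnKeys(String family, String[] names) {
        String[] keys = new String[names.length];
        for (int i = 0; i < names.length; i++) {
            keys[i] = family + names[i];
        }
        return keys;
    }

    // Maps one tab-separated input line to column key -> value, reusing COLUMN_KEYS.
    static Map<String, String> toColumns(String line) {
        String[] splits = line.split("\t");
        Map<String, String> columns = new LinkedHashMap<>();
        // Stop at the shorter of the two arrays so a short row cannot overrun.
        int n = Math.min(COLUMN_KEYS.length, splits.length);
        for (int j = 0; j < n; j++) {
            columns.put(COLUMN_KEYS[j], splits[j]);
        }
        return columns;
    }
}
```

In the real mapper the same array could presumably be filled in a static
initializer, or in configure(JobConf), which MapReduceBase provides and which
Hadoop calls once per task; values placed in the JobConf by the driver could
be read there with job.get(...).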

This TableReduce implementation inserts the records into the HTable, as
follows:

	public static class InnerReduceWithTOF extends MapReduceBase implements
			TableReduce<ImmutableBytesWritable, BatchUpdate> {

		public void reduce(ImmutableBytesWritable key,
				Iterator<BatchUpdate> value,
				OutputCollector<ImmutableBytesWritable, BatchUpdate> output,
				Reporter reporter) throws IOException {

			while (value.hasNext()) {
				output.collect(key, value.next());
			}

		}
	}

And here is the configuration:

		JobConf c = new JobConf(getConf(), MapReduceHBaseTest.class);
		c.setJobName("ConfMapReduce2");
		FileInputFormat.setInputPaths(c, new Path("snp.txt"));

		c.setMapperClass(InnerMapWithTOF.class);
		c.setReducerClass(InnerReduceWithTOF.class);
		c.setOutputFormat(TableOutputFormat.class);
		c.set(TableOutputFormat.OUTPUT_TABLE, "snp");

		c.setOutputKeyClass(ImmutableBytesWritable.class);
		c.setOutputValueClass(BatchUpdate.class);

		c.setMapOutputKeyClass(ImmutableBytesWritable.class);
		c.setMapOutputValueClass(BatchUpdate.class);

		int partitioner = c.getNumMapTasks();

		System.out.println(partitioner);
		System.out.println(c.getNumReduceTasks());

		TableMapReduceUtil.initTableReduceJob("snp",
				InnerReduceWithTOF.class, c);

		JobClient.runJob(c);


TIA,
Ramesh
-- 
View this message in context: http://www.nabble.com/Map-Reduce-performance-tp24166190p24214918.html
Sent from the HBase User mailing list archive at Nabble.com.

