hadoop-common-user mailing list archives

From Shekhar Sharma <shekhar2...@gmail.com>
Subject Re: Reg:Hive query with mapreduce
Date Thu, 20 Feb 2014 17:26:42 GMT
Assuming you are using TextInputFormat and your data set is comma-separated
values, where the second column is empId and the third column is salary, your
map function would look like this:



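import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FooMapper extends Mapper<LongWritable,Text,Text,NullWritable>
{
    // Emits empId (second column) for each record whose salary (third column)
    // passes the filter from the query: where sal>12000
    @Override
    public void map(LongWritable offset, Text empRecord, Context context)
            throws IOException, InterruptedException
    {
        String[] splits = empRecord.toString().split(",");
        double salary = Double.parseDouble(splits[2]);
        if (salary > 12000)
        {
            // The key carries the empId; no value is needed, so write a NullWritable.
            context.write(new Text(splits[1]), NullWritable.get());
        }
    }
}

Set the number of reducer tasks to zero.

The number of output files will then equal the number of map tasks. If you
want a single output file:

(1) Set mapred.min.split.size to the file size or some larger value such as
Long.MAX_VALUE. For a single input file this spawns only one mapper task, so
you get one output file.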
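
To see why raising the minimum split size collapses the job to one mapper, here is a rough sketch in plain Java of how FileInputFormat sizes splits for a single splittable file. It is a simplified model, not the Hadoop source: the class name SplitModel, its method names, and the file and block sizes are all made up for illustration, and zero-length files are not modelled.

```java
// Simplified model of split sizing for one splittable input file.
// SplitModel and its methods are hypothetical names; only the arithmetic is modelled.
final class SplitModel {
    // A remainder of up to 10% over the split size is folded into the last split.
    private static final double SPLIT_SLOP = 1.1;

    // Split size is the block size, clamped between the minimum and maximum split sizes.
    static long splitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Counts how many splits (and therefore map tasks) one file would produce.
    static int countSplits(long fileSize, long blockSize, long minSize, long maxSize) {
        long size = splitSize(blockSize, minSize, maxSize);
        long remaining = fileSize;
        int splits = 0;
        while ((double) remaining / size > SPLIT_SLOP) {
            splits++;
            remaining -= size;
        }
        if (remaining != 0) {
            splits++; // the final, possibly short, split
        }
        return splits;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 1024 * mb;  // assumed 1 GB input file
        long blockSize = 128 * mb;  // assumed HDFS block size

        // Minimum split size of 1 byte: one split per block, prints 8.
        System.out.println(countSplits(fileSize, blockSize, 1L, Long.MAX_VALUE));
        // Minimum split size of Long.MAX_VALUE: the whole file is one split, prints 1.
        System.out.println(countSplits(fileSize, blockSize, Long.MAX_VALUE, Long.MAX_VALUE));
    }
}
```

In a driver using the new API, the two settings would typically be applied with job.setNumReduceTasks(0) and FileInputFormat.setMinInputSplitSize(job, Long.MAX_VALUE). In Hadoop 2 the property mapred.min.split.size is the deprecated name of mapreduce.input.fileinputformat.split.minsize.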

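The mapper above assumes every line has at least three comma-separated fields and a numeric salary; a short or malformed row would throw and fail the task. As a sketch only, the per-record logic could be pulled into a plain Java helper that skips bad rows and can be tried without a cluster. EmpFilter, selectEmpId and the sample rows are hypothetical, and the threshold is taken from the quoted query (sal>12000).

```java
import java.util.Optional;

// Hypothetical helper mirroring the mapper's per-record logic without Hadoop types.
final class EmpFilter {
    // Threshold from the quoted query: where sal>12000
    static final double SALARY_THRESHOLD = 12000;

    // Returns the empId (second column) when the salary (third column) exceeds the
    // threshold; returns empty for short rows, non-numeric salaries, or lower salaries.
    static Optional<String> selectEmpId(String line) {
        String[] fields = line.split(",");
        if (fields.length < 3) {
            return Optional.empty();
        }
        try {
            double salary = Double.parseDouble(fields[2].trim());
            return salary > SALARY_THRESHOLD
                    ? Optional.of(fields[1].trim())
                    : Optional.empty();
        } catch (NumberFormatException e) {
            return Optional.empty(); // skip rows whose salary is not a number
        }
    }

    public static void main(String[] args) {
        // Sample rows are invented; the first column's meaning is an assumption.
        System.out.println(selectEmpId("Alice,E101,15000")); // Optional[E101]
        System.out.println(selectEmpId("Bob,E102,9000"));    // Optional.empty
        System.out.println(selectEmpId("Carol,E103,n/a"));   // Optional.empty
    }
}
```

Inside map() you would then write a record only when the returned Optional is present.
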
Regards,
Som Shekhar Sharma
+91-8197243810


On Thu, Feb 20, 2014 at 5:55 PM, Ranjini Rathinam <ranjinibecse@gmail.com> wrote:

> Hi,
>
> How to implement the Hive query such as
>
> select * from table comp;
>
> select empId from comp where sal>12000;
>
> in mapreduce.
>
> Need to use this query in mapreduce code. How to implement the above query
> in the code using mapreduce , JAVA.
>
>
> Please provide the sample code.
>
> Thanks in advance for the support
>
> Regards
>
> Ranjini
>
>
>
>
>
