From: unmesha sreeveni <unmeshabiju@gmail.com>
Date: Fri, 17 Jan 2014 15:46:37 +0530
Subject: Re: Sorting a csv file
To: User Hadoop <user@hadoop.apache.org>

Are we able to sort on multiple columns dynamically, as the user requests?
For example, the user first asks to sort on col1 and col2,
and then asks to sort on 3 columns.
I have not been able to find anything on this through Googling.
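
A possible starting point for the multi-column question, sketched in plain Java rather than as a MapReduce job: the comparator below orders CSV lines by a caller-supplied list of 0-based column indices, so the same code handles two columns or three. The class name MultiColumnSort and its helpers are hypothetical, and the sketch assumes simple comma-separated fields with no quoting. In a Hadoop job the column list would presumably be passed in through the job Configuration and used to build the map output key; that wiring is not shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Hadoop: sorts CSV lines on several columns.
class MultiColumnSort {

    // Returns the field at the given 0-based column, or "" if the row is too short.
    static String field(String[] tokens, int column) {
        return column < tokens.length ? tokens[column] : "";
    }

    // Compares two lines column by column, in the order the columns were given.
    static Comparator<String> byColumns(int... columns) {
        return (a, b) -> {
            String[] x = a.split(",", -1);
            String[] y = b.split(",", -1);
            for (int column : columns) {
                int c = field(x, column).compareTo(field(y, column));
                if (c != 0) {
                    return c;
                }
            }
            return 0;
        };
    }

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sort(List<String> lines, int... columns) {
        List<String> copy = new ArrayList<>(lines);
        copy.sort(byColumns(columns));
        return copy;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("b,a,v", "a,c,p", "d,a,z", "q,z,a", "r,a,b");
        System.out.println(sort(rows, 1, 2)); // second field first, ties broken on the third
        System.out.println(sort(rows, 0, 1)); // first field first, ties broken on the second
    }
}
```

With the sample rows from this thread, sorting on columns 1 and 2 (0-based) orders the rows by the second field and breaks ties on the third.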


On Thu, Jan 16, 2014 at 4:03 PM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:
Yes, I did.
But how do I make it sort in descending order?

My current code runs in ascending order:

import java.io.IOException;
import java.util.ArrayList;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class SortingCsv {
    public static class Map extends Mapper<LongWritable, Text, Text, Text> {
        private Text word = new Text();
        private Text one = new Text();

        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            System.out.println("in mapper");
            /*
             * sort
             */
            ArrayList<String> ar = new ArrayList<String>();
            String line = value.toString();
            String[] tokens = null;
            ar.add(line);
            System.out.println("list: " + ar);
            for (int i = 0; i < ar.size(); i++) {
                tokens = (ar.get(i)).split(",");
                System.out.println("ele: " + ar.get(i));
                System.out.println("token: " + tokens[1]); // change according to user input
                word.set(tokens[1]);  // key: the column to sort on
                one.set(ar.get(i));   // value: the whole line
                context.write(word, one);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("in main");
        Configuration conf = new Configuration();

        Job job = new Job(conf, "wordcount");
        job.setJarByClass(SortingCsv.class);
        // Path intermediateInfo = new Path("out");
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        job.setMapperClass(Map.class);
        FileSystem fs = FileSystem.get(conf);

        /* Delete the files if any in the output path */
        if (fs.exists(new Path(args[1])))
            fs.delete(new Path(args[1]), true);

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        job.waitForCompletion(true);
    }
}
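
On the descending-order question: the order of the job output follows the order in which the map output keys are compared, so one way to reverse it is to invert that comparison. Hadoop's Job has a setSortComparatorClass method for registering a custom key comparator, and a comparator that negates the default result is a common way to use it. The sketch below is not that Hadoop class; it only shows the inversion itself in plain Java, with a hypothetical class name and the assumption of simple comma-separated rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Hadoop: sorts CSV lines on one column, descending.
class DescendingColumnSort {

    // Returns the field at the given 0-based column, or "" if the row is too short.
    static String field(String line, int column) {
        String[] tokens = line.split(",", -1);
        return column < tokens.length ? tokens[column] : "";
    }

    // Ascending comparison on a single column.
    static Comparator<String> byColumn(int column) {
        return (a, b) -> field(a, column).compareTo(field(b, column));
    }

    // Descending order is the ascending comparator with its result inverted.
    static List<String> sortDescending(List<String> lines, int column) {
        List<String> copy = new ArrayList<>(lines);
        copy.sort(byColumn(column).reversed());
        return copy;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("b,a,v", "a,c,p", "d,a,z", "q,z,a", "r,a,b");
        System.out.println(sortDescending(rows, 1));
    }
}
```

Rows that share the same key keep their input order here because List.sort is stable; a MapReduce job gives no such guarantee for values under one key.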


On Thu, Jan 16, 2014 at 10:26 AM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:
Thanks for your reply, Ramya.
OK :) So do I need to transpose the entire .csv file in order to get the entire col2 data?


On Thu, Jan 16, 2014 at 10:11 AM, Ramya S <ramyas@suntecgroup.com> wrote:
Try keeping the col2 value as the map output key and the whole line, e.g. "b,a,v", as the map output value.
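
To illustrate the suggestion above without a cluster, here is a minimal plain-Java sketch that imitates the idea: each line is emitted under its col2 value, the keys are kept in sorted order (a TreeMap stands in for the shuffle and sort step), and the lines are then written out key by key. The class and method names are hypothetical, and this is only a simulation of the data flow, not Hadoop code. No transposing of the file is involved, since each line is split on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of "column value as key, whole line as value".
class KeyByColumnSketch {

    static List<String> sortViaKey(List<String> lines, int keyColumn) {
        // "Map" step: emit (key column, whole line). The TreeMap keeps keys
        // in sorted order, standing in for the shuffle and sort phase.
        Map<String, List<String>> shuffled = new TreeMap<>();
        for (String line : lines) {
            String[] tokens = line.split(",", -1);
            if (keyColumn >= tokens.length) {
                continue; // skip rows that do not have the requested column
            }
            shuffled.computeIfAbsent(tokens[keyColumn], k -> new ArrayList<>()).add(line);
        }
        // "Reduce" step: identity, write every line grouped under its key.
        List<String> out = new ArrayList<>();
        for (List<String> group : shuffled.values()) {
            out.addAll(group);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("b,a,v", "a,c,p", "d,a,z", "q,z,a", "r,a,b");
        System.out.println(sortViaKey(rows, 1));
    }
}
```

For the sample rows in this thread and key column 1 (0-based), the three rows with "a" in the second field come first, followed by the "c" row and then the "z" row.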



Regards...
Ramya.S



________________________________

From: unmesha sreeveni [mailto:unmeshabiju@gmail.com]
Sent: Thu 1/16/2014 9:29 AM
To: User Hadoop
Subject: Re: Sorting a csv file


Thanks, Ramya.
I was trying to do it with NullWritable.

Thanks a lot, Ramya.

And do you have any idea how to sort on a given column?
Say the user gives col2 as the column to sort on; then I want to get:
b,a,v
a,c,p
d,a,z
q,z,a
r,a,b

b,a,v
d,a,z
r,a,b

a,c,p

q,z,a

How do I approach that?

In my current implementation I am getting
this result:
a,c,p
b,a,v
d,a,z
q,z,a
r,a,b


using the above code.


On Wed, Jan 15, 2014 at 5:09 PM, Ramya S <ramyas@suntecgroup.com> wrote:


        All you need to do is change the map output value class to Text.
        Set this accordingly in main.

        Eg:

        public static class Map extends Mapper<LongWritable, Text, Text, Text> {
            private Text one = new Text("");
            private Text word = new Text();

            public void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                System.out.println("in mapper");
                String line = value.toString();
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    word.set(tokenizer.nextToken());
                    context.write(word, one);
                    System.out.println("sort: " + word);
                }
            }
        }


        Regards...
        Ramya.S


        ________________________________

        From: unmesha sreeveni [mailto:unmeshabiju@gmail.com]
        Sent: Wed 1/15/2014 4:11 PM
        To: User Hadoop
        Subject: Re: Sorting a csv file



        I wrote a map-only job for sorting a text file by editing the word count program.
        I only need the key.
        How do I set the value to null?


        import java.io.IOException;
        import java.util.StringTokenizer;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;
        import org.apache.hadoop.io.IntWritable;
        import org.apache.hadoop.io.LongWritable;
        import org.apache.hadoop.io.Text;
        import org.apache.hadoop.mapreduce.Job;
        import org.apache.hadoop.mapreduce.Mapper;
        import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
        import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
        import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
        import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

        public class SortingCsv {
            public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
                private final static IntWritable one = new IntWritable(1);
                private Text word = new Text();

                public void map(LongWritable key, Text value, Context context)
                        throws IOException, InterruptedException {
                    System.out.println("in mapper");
                    String line = value.toString();
                    StringTokenizer tokenizer = new StringTokenizer(line);
                    while (tokenizer.hasMoreTokens()) {
                        word.set(tokenizer.nextToken());
                        context.write(word, one);
                        System.out.println("sort: " + word);
                    }
                }
            }

            public static void main(String[] args) throws Exception {
                System.out.println("in main");
                Configuration conf = new Configuration();

                Job job = new Job(conf, "wordcount");
                job.setJarByClass(SortingCsv.class);
                // Path intermediateInfo = new Path("out");
                job.setOutputKeyClass(Text.class);
                job.setOutputValueClass(IntWritable.class);

                job.setMapperClass(Map.class);
                FileSystem fs = FileSystem.get(conf);

                /* Delete the files if any in the output path */
                if (fs.exists(new Path(args[1])))
                    fs.delete(new Path(args[1]), true);

                job.setInputFormatClass(TextInputFormat.class);
                job.setOutputFormatClass(TextOutputFormat.class);
                FileInputFormat.addInputPath(job, new Path(args[0]));
                FileOutputFormat.setOutputPath(job, new Path(args[1]));

                job.waitForCompletion(true);
            }
        }


        On Wed, Jan 15, 2014 at 2:50 PM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:


                How do I sort a CSV file?
                I know that shuffle and sort take place between map and reduce.
                But how do I sort each column in a CSV file?


                --

                Thanks & Regards

                Unmesha Sreeveni U.B
                Junior Developer

                http://www.unmeshasreeveni.blogspot.in/







