hadoop-user mailing list archives

From unmesha sreeveni <unmeshab...@gmail.com>
Subject Re: Sorting a csv file
Date Fri, 17 Jan 2014 10:16:37 GMT
Are we able to sort on multiple columns dynamically, as the user requests?
i.e. the user first asks to sort on col1 and col2,
then the user asks to sort on 3 cols.
I am not able to find anything on this through googling.
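
To make the question concrete, here is a minimal plain-Java sketch (no Hadoop, Java 8 or later assumed) of the comparison logic I have in mind. The class name CsvColumnSort, the methods fromSpec and sortLines, and the spec format "2:asc,1:desc" (1-based column numbers) are all made up for illustration, and it assumes every row contains every column named in the spec. In a MapReduce job the same comparison could presumably be driven by a value read from the job Configuration inside a comparator registered with job.setSortComparatorClass, but that wiring is not shown or tested here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CsvColumnSort {

    /**
     * Builds a row comparator from a spec such as "2:asc,1:desc".
     * Column numbers are 1-based; a missing direction means ascending.
     * The spec format is an assumption made for this sketch.
     */
    static Comparator<String[]> fromSpec(String spec) {
        Comparator<String[]> result = null;
        for (String part : spec.split(",")) {
            String[] pieces = part.trim().split(":");
            final int col = Integer.parseInt(pieces[0]) - 1;
            boolean descending =
                pieces.length > 1 && pieces[1].equalsIgnoreCase("desc");
            // Compare rows by the text of one column.
            Comparator<String[]> byCol =
                Comparator.comparing((String[] row) -> row[col]);
            if (descending) {
                byCol = byCol.reversed();
            }
            // Later columns only break ties left by earlier ones.
            result = (result == null) ? byCol : result.thenComparing(byCol);
        }
        return result;
    }

    /** Sorts CSV lines by the given spec; the sort is stable. */
    static List<String> sortLines(List<String> lines, String spec) {
        List<String[]> rows = new ArrayList<>();
        for (String line : lines) {
            rows.add(line.split(","));
        }
        rows.sort(fromSpec(spec));
        List<String> out = new ArrayList<>();
        for (String[] row : rows) {
            out.add(String.join(",", row));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input =
            Arrays.asList("b,a,v", "a,c,p", "d,a,z", "q,z,a", "r,a,b");
        // Sort on col2 ascending, then col1 descending.
        for (String line : sortLines(input, "2:asc,1:desc")) {
            System.out.println(line);
        }
    }
}
```

With the sample rows from earlier in this thread, the spec "2" gives b,a,v / d,a,z / r,a,b / a,c,p / q,z,a, and "2:asc,1:desc" gives r,a,b / d,a,z / b,a,v / a,c,p / q,z,a. Changing the spec string is the only thing needed to sort on a different set of columns.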


On Thu, Jan 16, 2014 at 4:03 PM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:

> Yes, I did.
> But how do I make it sort in descending order?
>
> My current code sorts in ascending order:
>
> import java.io.IOException;
> import java.util.ArrayList;
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.io.LongWritable;
> import org.apache.hadoop.io.Text;
> import org.apache.hadoop.mapreduce.Job;
> import org.apache.hadoop.mapreduce.Mapper;
> import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
> import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
> import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
> import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
>
> public class SortingCsv {
>   public static class Map extends Mapper<LongWritable, Text, Text, Text> {
>     private Text word = new Text();
>     private Text one = new Text();
>
>     public void map(LongWritable key, Text value, Context context)
>         throws IOException, InterruptedException {
>       System.out.println("in mapper");
>       /* sort */
>       ArrayList<String> ar = new ArrayList<String>();
>       String line = value.toString();
>       String[] tokens = null;
>       ar.add(line);
>       System.out.println("list: " + ar);
>       for (int i = 0; i < ar.size(); i++) {
>         tokens = ar.get(i).split(",");
>         System.out.println("ele: " + ar.get(i));
>         // change the column index according to user input
>         System.out.println("token: " + tokens[1]);
>         word.set(tokens[1]);    // key: the column to sort on
>         one.set(ar.get(i));     // value: the whole line
>         context.write(word, one);
>       }
>     }
>   }
>
>   public static void main(String[] args) throws Exception {
>     System.out.println("in main");
>     Configuration conf = new Configuration();
>
>     Job job = new Job(conf, "wordcount");
>     job.setJarByClass(SortingCsv.class);
>     // Path intermediateInfo = new Path("out");
>     job.setOutputKeyClass(Text.class);
>     job.setOutputValueClass(Text.class);
>
>     job.setMapperClass(Map.class);
>     FileSystem fs = FileSystem.get(conf);
>
>     /* Delete the files if any in the output path */
>     if (fs.exists(new Path(args[1])))
>       fs.delete(new Path(args[1]), true);
>
>     job.setInputFormatClass(TextInputFormat.class);
>     job.setOutputFormatClass(TextOutputFormat.class);
>
>     FileInputFormat.addInputPath(job, new Path(args[0]));
>     FileOutputFormat.setOutputPath(job, new Path(args[1]));
>
>     job.waitForCompletion(true);
>   }
> }
>
>
>
> On Thu, Jan 16, 2014 at 10:26 AM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:
>
>> Thanks for your reply, Ramya.
>> OK :). So do I need to transpose the entire .csv file in order to get
>> all the col 2 data?
>>
>>
>> On Thu, Jan 16, 2014 at 10:11 AM, Ramya S <ramyas@suntecgroup.com> wrote:
>>
>>> Try using the col2 value as the map output key and the whole line,
>>> e.g. "b,a,v", as the map output value.
>>>
>>>
>>>
>>> Regards...
>>> Ramya.S
>>>
>>>
>>>
>>> ________________________________
>>>
>>> From: unmesha sreeveni [mailto:unmeshabiju@gmail.com]
>>> Sent: Thu 1/16/2014 9:29 AM
>>> To: User Hadoop
>>> Subject: Re: Sorting a csv file
>>>
>>>
>>> Thanks, Ramya.
>>> I was trying to do it with NullWritable.
>>>
>>> Thanks a lot, Ramya.
>>>
>>> And do you have any idea how to sort on a given column?
>>> Say the user gives col2 to sort on. Then for the input
>>> b,a,v
>>> a,c,p
>>> d,a,z
>>> q,z,a
>>> r,a,b
>>>
>>> I want to get
>>> b,a,v
>>> d,a,z
>>> r,a,b
>>>
>>> a,c,p
>>>
>>> q,z,a
>>>
>>> How do I approach that?
>>>
>>> In my current implementation, using the above code, I am getting
>>> this result:
>>> a,c,p
>>> b,a,v
>>> d,a,z
>>> q,z,a
>>> r,a,b
>>>
>>>
>>>
>>> On Wed, Jan 15, 2014 at 5:09 PM, Ramya S <ramyas@suntecgroup.com> wrote:
>>>
>>>
>>>         All you need to do is change the map output value class to Text
>>>         and set it accordingly in main.
>>>
>>>         Eg:
>>>
>>>         public static class Map
>>>             extends Mapper<LongWritable, Text, Text, Text> {
>>>            private Text one = new Text("");
>>>
>>>            private Text word = new Text();
>>>
>>>            public void map(LongWritable key, Text value, Context context)
>>>                throws IOException, InterruptedException {
>>>             System.out.println("in mapper");
>>>                String line = value.toString();
>>>                StringTokenizer tokenizer = new StringTokenizer(line);
>>>                while (tokenizer.hasMoreTokens()) {
>>>                    word.set(tokenizer.nextToken());
>>>                    context.write(word, one);
>>>                    System.out.println("sort: "+word);
>>>                }
>>>            }
>>>         }
>>>
>>>
>>>         Regards...
>>>         Ramya.S
>>>
>>>
>>>         ________________________________
>>>
>>>         From: unmesha sreeveni [mailto:unmeshabiju@gmail.com]
>>>         Sent: Wed 1/15/2014 4:11 PM
>>>         To: User Hadoop
>>>         Subject: Re: Sorting a csv file
>>>
>>>
>>>
>>>         I wrote a map-only job for sorting a txt file by editing the
>>>         wordcount program.
>>>         I only need the key.
>>>         How do I set the value to null?
>>>
>>>
>>>         public class SortingCsv {
>>>         public static class Map
>>>             extends Mapper<LongWritable, Text, Text, IntWritable> {
>>>            private final static IntWritable one = new IntWritable(1);
>>>            private Text word = new Text();
>>>
>>>            public void map(LongWritable key, Text value, Context context)
>>>                throws IOException, InterruptedException {
>>>             System.out.println("in mapper");
>>>                String line = value.toString();
>>>                StringTokenizer tokenizer = new StringTokenizer(line);
>>>                while (tokenizer.hasMoreTokens()) {
>>>                    word.set(tokenizer.nextToken());
>>>                    context.write(word, one);
>>>                    System.out.println("sort: "+word);
>>>                }
>>>            }
>>>         }
>>>         public static void main(String[] args) throws Exception {
>>>         System.out.println("in main");
>>>            Configuration conf = new Configuration();
>>>
>>>                Job job = new Job(conf, "wordcount");
>>>                job.setJarByClass(SortingCsv.class);
>>>                //Path intermediateInfo = new Path("out");
>>>            job.setOutputKeyClass(Text.class);
>>>            job.setOutputValueClass(IntWritable.class);
>>>
>>>            job.setMapperClass(Map.class);
>>>            FileSystem fs = FileSystem.get(conf);
>>>
>>>         /* Delete the files if any in the output path */
>>>
>>>         if (fs.exists(new Path(args[1])))
>>>         fs.delete(new Path(args[1]), true);
>>>
>>>
>>>            job.setInputFormatClass(TextInputFormat.class);
>>>            job.setOutputFormatClass(TextOutputFormat.class);
>>>
>>>            FileInputFormat.addInputPath(job, new Path(args[0]));
>>>            FileOutputFormat.setOutputPath(job, new Path(args[1]));
>>>
>>>            job.waitForCompletion(true);
>>>         }
>>>
>>>         }
>>>
>>>
>>>         On Wed, Jan 15, 2014 at 2:50 PM, unmesha sreeveni <unmeshabiju@gmail.com> wrote:
>>>
>>>
>>>                 How do I sort a csv file?
>>>                 I know that shuffle and sort take place between map and
>>>                 reduce.
>>>                 But how do I sort each column in a csv file?
>>>
>>>
>>>                 --
>>>
>>>                 Thanks & Regards
>>>
>>>
>>>                 Unmesha Sreeveni U.B
>>>
>>>                 Junior Developer
>>>
>>>                 http://www.unmeshasreeveni.blogspot.in/
>>>
>>
>>
>
>


-- 
*Thanks & Regards*

Unmesha Sreeveni U.B
Junior Developer

http://www.unmeshasreeveni.blogspot.in/
