flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: why when use groupBy(2).sortGroup(0, Order.DESCENDING); not group by and not sort
Date Tue, 02 Jun 2015 00:23:40 GMT
You can also use sortPartition() to sort all partitions locally.
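
To make "sort all partitions locally" concrete, here is a minimal plain-Java sketch of the idea. It does not use the Flink API: the class name, the method name, and the way the values are split into two partitions are all assumptions made for illustration. Each partition ends up ordered on its own, but the partitions taken together are not globally ordered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a partition-local sort; not Flink code.
public class LocalSortSketch {

    // Sorts every partition on its own in descending order.
    // No element ever moves from one partition to another.
    static List<List<Integer>> sortEachPartition(List<List<Integer>> partitions) {
        List<List<Integer>> result = new ArrayList<>();
        for (List<Integer> partition : partitions) {
            List<Integer> copy = new ArrayList<>(partition);
            copy.sort(Comparator.reverseOrder());
            result.add(copy);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed split of the ids 1..5 across two parallel partitions.
        List<List<Integer>> partitions = Arrays.asList(
                Arrays.asList(1, 4, 2),
                Arrays.asList(5, 3));

        // prints [[4, 2, 1], [5, 3]]
        System.out.println(sortEachPartition(partitions));
    }
}
```

With a single partition the local order and the global order coincide, which is why a local sort can only stand in for a total sort when everything is in one partition.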
On Jun 2, 2015 02:11, "Chiwan Park" <chiwanpark@icloud.com> wrote:

> Hi. The sortGroup API returns a SortedGrouping object, but you don't use
> the result. I think you are confusing the groupBy and sortGroup APIs.
> You should use them as follows (I assume you are using 0.8 or
> 0.9-milestone-1):
>
> // Select the first 10 elements of each group.
> DataSet<Customer> sorted = customers.groupBy(2).sortGroup(0, Order.DESCENDING).first(10);
> sorted.print();
>
> Note that Flink does not currently support a global sort (FLINK-598),
> only local sorts. The sortGroup API sorts the elements within each group.
>
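
The semantics described above (group, sort inside each group, keep the first n elements per group) can be sketched in plain Java. This is an illustration under stated assumptions, not Flink code: the class and method names are hypothetical, and the sample ids and keys are made up, loosely modeled on the output quoted further down.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for "group, sort each group, take the first n"; not Flink code.
public class GroupSortSketch {

    // Groups ids by the key at the same index, sorts each group in
    // descending order, and keeps at most n ids per group.
    static Map<String, List<Long>> firstNPerGroup(long[] ids, String[] keys, int n) {
        Map<String, List<Long>> groups = new LinkedHashMap<>();
        for (int i = 0; i < ids.length; i++) {
            groups.computeIfAbsent(keys[i], k -> new ArrayList<>()).add(ids[i]);
        }

        Map<String, List<Long>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Long>> entry : groups.entrySet()) {
            List<Long> sorted = new ArrayList<>(entry.getValue());
            sorted.sort(Comparator.reverseOrder());
            int limit = Math.min(n, sorted.size());
            result.put(entry.getKey(), new ArrayList<>(sorted.subList(0, limit)));
        }
        return result;
    }

    public static void main(String[] args) {
        long[] ids = {1, 2, 3, 4, 5};
        String[] keys = {"BUILDING", "AUTOMOBILE", "AUTOMOBILE", "MACHINERY", "HOUSEHOLD"};

        // prints {BUILDING=[1], AUTOMOBILE=[3], MACHINERY=[4], HOUSEHOLD=[5]}
        System.out.println(firstNPerGroup(ids, keys, 1));
    }
}
```

The ordering holds only inside each group; nothing here orders the groups relative to one another, which is the difference from a global sort.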
>
> Regards,
> Chiwan Park
>
> > On Jun 2, 2015, at 5:02 AM, hagersaleh <loveallah1987@yahoo.com> wrote:
> >
> > Why does groupBy(2).sortGroup(0, Order.DESCENDING); neither group nor
> > sort?
> >
> > I want to sort a DataSet. How can I do that?
> >
> > customers = customers.filter(
> >            new FilterFunction<Customer>() {
> >                    @Override
> >                    public boolean filter(Customer c) {
> >                        return Integer.parseInt(c.getField(0).toString()) <= 5;
> >                    }
> >            });
> >
> >       customers.groupBy(2).sortGroup(0, Order.DESCENDING);
> >       System.out.println(customers.print());
> >       customers.writeAsCsv("/home/hadoop/Desktop/Dataset/output.csv", "\n", "|");
> >       env.execute();
> >
> >
> > public static class Customer extends Tuple5<Long,String,String,String,String> {
> > }
> >
> > private static DataSet<Customer> getCustomerDataSet(ExecutionEnvironment env) {
> >       return env.readCsvFile("/home/hadoop/Desktop/Dataset/customer.csv")
> >                 .fieldDelimiter('|')
> >                 .includeFields("11100110").ignoreFirstLine()
> >                 .tupleType(Customer.class);
> > }
> >
> > The result is not sorted:
> > 2> (1,Customer#000000001,IVhzIApeRb ot&&c&&E,711.56,BUILDING)
> > 2> (2,Customer#000000002,XSTf4&&NCwDVaWNe6tEgvwfmRchLXak,121.65,AUTOMOBILE)
> > 2> (3,Customer#000000003,MG9kdTD2WBHm,7498.12,AUTOMOBILE)
> > 2> (4,Customer#000000004,XxVSJsLAGtn,2866.83,MACHINERY)
> > 2> (5,Customer#000000005,KvpyuHCplrB84WgAiGV6sYpZq7Tj,794.47,HOUSEHOLD)
> >
> >
> >
> > --
> > View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/why-when-use-groupBy-2-sortGroup-0-Order-DESCENDING-not-group-by-and-not-sort-tp1436.html
