Mailing-List: contact user-help@flink.apache.org; run by ezmlm
From: Fabian Hueske
To: user@flink.apache.org
Reply-To: user@flink.apache.org
Date: Wed, 3 Jun 2015 23:40:24 +0200
Subject: Re: flink terasort
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=089e0160b43617da760517a3e8d3

--089e0160b43617da760517a3e8d3
Content-Type: text/plain; charset=UTF-8

A TeraSort implementation for the current DataSet API would look a bit different from the one written against the deprecated Record API.

Flink doesn't support automatic range partitioning. However, if the distribution of the key values is known, you can range-partition the data yourself with a custom partitioner (DataSet.partitionCustom()) and then call DataSet.sortPartition() on the result. Together these produce a global sort, which is what a TeraSort program needs.

Just drop a mail if you have further questions.

Cheers, Fabian

2015-06-03 17:34 GMT+02:00 Bill Sparks <jsparks@cray.com>:

> Will take a look, thanks.
> --
> Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.
>
> From: Chiwan Park <chiwanpark@icloud.com>
> Reply-To: "user@flink.apache.org" <user@flink.apache.org>
> Date: Wednesday, June 3, 2015 10:24 AM
> To: "user@flink.apache.org" <user@flink.apache.org>
> Subject: Re: flink terasort
>
> There is a terasort implementation with deprecated API.
>
> https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java
>
> AFAIK, there is no implementation with current API.
>
> Regards,
> Chiwan Park
>
> On Jun 4, 2015, at 12:17 AM, Bill Sparks <jsparks@cray.com> wrote:
>
> Just asking, is there an implementation of terasort for flink?
>
> Regards,
> Bill.
> --
> Jonathan (Bill) Sparks
> Software Architecture
> Cray Inc.

--089e0160b43617da760517a3e8d3
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
A TeraSort implementation for the current DataSet API would look a bit different from the one written against the deprecated Record API.
Flink doesn't support automatic range partitioning. However, if the distribution of the key values is known, you can range-partition the data yourself with a custom partitioner (DataSet.partitionCustom()) and then call DataSet.sortPartition() on the result. Together these produce a global sort, which is what a TeraSort program needs.
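
To make the two steps concrete, here is a minimal sketch in plain Python. It does not use the Flink API. The names range_partition and global_sort are hypothetical, and the split points are assumed to be known up front, as the mail requires. The first function is roughly the logic a custom partitioner passed to DataSet.partitionCustom() would have to implement; the per-partition sorted() call stands in for DataSet.sortPartition().

```python
from bisect import bisect_right
from typing import List, Sequence


def range_partition(key: int, boundaries: Sequence[int]) -> int:
    """Return the partition index for key, given sorted split points.

    With boundaries [b0, b1, ...], keys below b0 go to partition 0,
    keys in [b0, b1) go to partition 1, and so on.
    """
    return bisect_right(boundaries, key)


def global_sort(records: Sequence[int], boundaries: Sequence[int]) -> List[List[int]]:
    """Range-partition the records, then sort each partition on its own."""
    # Step 1: route every record to the partition that owns its key range
    # (hypothetical stand-in for a custom range partitioner).
    partitions: List[List[int]] = [[] for _ in range(len(boundaries) + 1)]
    for record in records:
        partitions[range_partition(record, boundaries)].append(record)
    # Step 2: sort each partition independently
    # (hypothetical stand-in for a per-partition sort).
    return [sorted(partition) for partition in partitions]


if __name__ == "__main__":
    data = [42, 7, 93, 15, 68, 3, 77, 50]
    # Assumed split points; in practice they come from the known key distribution.
    for index, partition in enumerate(global_sort(data, boundaries=[25, 50, 75])):
        print(index, partition)
```

Because every key in partition i is smaller than every key in partition i+1, reading the sorted partitions in index order yields a globally sorted result. The split points only affect load balance, which is why the key distribution has to be known in advance.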

Just drop a mail if you have further questions.

Cheers, Fabian

2015-06-03 17:34 GMT+02:00 Bill Sparks <jsparks@cray.com>:
Will take a look, thanks.
--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.

From: Chiwan Park <chiwanpark@icloud.com>
Reply-To: "user@flink.apache.org" <user@flink.apache.org>
Date: Wednesday, June 3, 2015 10:24 AM
To: "user@flink.apache.org" <user@flink.apache.org>
Subject: Re: flink terasort

There is a terasort implementation with deprecated API.
https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/recordJobs/sort/TeraSort.java

AFAIK, there is no implementation with current API.

Regards,
Chiwan Park



On Jun 4, 2015, at 12:17 AM, Bill Sparks <jsparks@cray.com> wrote:

Just asking, is there an implementation of terasort for flink?

Regards,
   Bill.
--
Jonathan (Bill) Sparks
Software Architecture
Cray Inc.


--089e0160b43617da760517a3e8d3--