From: Stefan Bunk
To: user@flink.incubator.apache.org
Date: Sun, 23 Nov 2014 12:14:32 +0100
Subject: Re: Counting the number of elements in a dataset

If it's about the verbosity, you can just use iter.size instead of your
self-written count, right?

val numVertices =
    (srcVertices union targetVertices).distinct.reduceGroup { iter => iter.size }

Performance-wise, this is the same, though.

Cheers
Stefan

On Sat, Nov 22, 2014 at 8:17 PM, Márton Balassi <balassi.marton@gmail.com> wrote:

> Hey,
>
> There was a thread recently on the dev list that might be interesting to
> you [1].
> I do not know the exact state of the code though.
>
> [1]
> http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Hi-Aggregation-support-td2311.html#a2547
>
> Cheers,
>
> Marton
>
> On Sat, Nov 22, 2014 at 8:09 PM, Sebastian Schelter <ssc@apache.org> wrote:
>
>> Hi,
>>
>> Is there a simple way to count the number of elements of a dataset? At
>> the moment, I have to use the following code, which is pretty verbose and
>> inefficient.
>>
>>     val numVertices =
>>       (srcVertices union targetVertices).distinct.reduceGroup { iter =>
>>         var count = 0L
>>         while (iter.hasNext) {
>>           count += 1
>>           iter.next
>>         }
>>         count
>>       }
>>
>> Best,
>> Sebastian
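
The claim that iter.size and a hand-written loop cost the same can be checked outside Flink. The following is a minimal sketch in plain Scala, not Flink code: the object name CountSketch, the helpers countManually and countWithSize, and the sample vertex lists are all hypothetical, and ordinary Scala collections stand in for the datasets in the thread. It only relies on the standard-library fact that Iterator.size consumes the iterator element by element.

```scala
object CountSketch {
  // Hand-written count: walks the iterator once, starting from zero.
  def countManually[T](iter: Iterator[T]): Long = {
    var count = 0L
    while (iter.hasNext) {
      iter.next()
      count += 1
    }
    count
  }

  // Built-in count: Iterator.size also traverses every element,
  // so it does the same amount of work as the loop above.
  def countWithSize[T](iter: Iterator[T]): Long = iter.size.toLong

  def main(args: Array[String]): Unit = {
    // Hypothetical stand-ins for srcVertices and targetVertices.
    val srcVertices = Seq(1, 2, 3)
    val targetVertices = Seq(3, 4)

    // Local equivalent of `union` followed by `distinct`.
    val vertices = (srcVertices ++ targetVertices).distinct

    println(countManually(vertices.iterator)) // prints 4
    println(countWithSize(vertices.iterator)) // prints 4
  }
}
```

Both helpers make a single pass over the elements, so the shorter form saves typing rather than time. One difference worth keeping in mind: Iterator.size returns an Int, while the loop accumulates into a Long.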