From: Ufuk Celebi
Date: Mon, 6 Jun 2016 16:12:37 +0200
Subject: Re: Custom keyBy(), look for similaties
To: user@flink.apache.org

Hey Iñaki,

you can use the KeySelector as described here:
https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/common/index.html#specifying-keys

But you only have a local view of the current element, i.e. the library you use to determine the similarity has to know the similarities upfront.

– Ufuk

On Mon, Jun 6, 2016 at 9:31 AM, iñaki williams wrote:
> Hi guys,
>
> I am using Flink on my project and I have a question. (I am using Java)
>
> Is it possible to modify the keyBy method in order to key by similarities
> and not by the exact name?
>
> Example: I receive 2 DataStreams. In the first one, the value of the field
> that I want to keyBy is "John Locke", while in the second DataStream the
> field value is "John L". Can I use some Java library to find similarities
> between strings and, if the similarity is high, key those elements
> together?
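
To illustrate the point made in the reply: a KeySelector sees one element at a time, so one possible workaround is to map every name to a canonical key, so that names considered "similar" produce the same key. The following is only a minimal sketch in plain Java under an assumed, made-up rule (first name plus the first letter of the last name). `NameKeys` and `canonicalKey` are hypothetical names and not part of Flink.

```java
import java.util.Locale;

// Hypothetical helper (not a Flink class): derives a grouping key from a
// single name, using only that element and no knowledge of other elements.
final class NameKeys {
    private NameKeys() {}

    // Assumed rule for illustration only: lower-case the name, then keep the
    // first token plus the first character of the last token.
    static String canonicalKey(String name) {
        String[] parts = name.trim().toLowerCase(Locale.ROOT).split("\\s+");
        if (parts.length == 1) {
            return parts[0]; // single-word (or empty) names are used as-is
        }
        String last = parts[parts.length - 1];
        return parts[0] + " " + last.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(canonicalKey("John Locke")); // prints "john l"
        System.out.println(canonicalKey("John L"));     // prints "john l"
    }
}
```

In a Flink job, a function like this could be called from the getKey method of a KeySelector passed to keyBy(). Note that this is exact matching on a normalized key, not a pairwise similarity measure: names that normalize differently (for example "Jon Locke") would still end up in different groups.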