From: Fabian Hueske
Date: Mon, 27 Nov 2017 15:23:40 +0100
Subject: Re: [DISCUSS] FLIP-23 Model Serving
To: dev@flink.apache.org
Cc: Boris Lublinsky, Eron Wright, Roberto Bentivoglio, Riccardo Diomedi, Mauro Cortellazzi, "Geerdink, A.S. (Bas)", Andrea Spina

Hi Stavros,

thanks for the detailed FLIP! Model serving is an important use case and it's great to see efforts to add a library for this to Flink!

I've read the FLIP and would like to ask a few questions and make some suggestions.

1) Is it a strict requirement that an ML pipeline must be able to handle different input types? I understand that it makes sense to have different models for different instances of the same type, i.e., same data type but different keys. Hence, the key-based joins make sense to me. However, couldn't completely different types be handled by different ML pipelines, or would there be major drawbacks?

2) I think from an API point of view it would be better not to require input records to be encoded as ProtoBuf messages. Instead, the model server could accept strongly-typed objects (Java/Scala) and (if necessary) convert them to ProtoBuf messages internally. In case we need to support different types of records (see my first point), we can introduce a Union type (i.e., an n-ary Either type). I see that we need some kind of binary encoding format for the models, but maybe this can also be designed to be pluggable such that other encodings can be added later.

3) I think the DataStream Java API should be supported as a first-class citizen for this library.

4) For the integration with the DataStream API, we could provide an API that receives (typed) DataStream objects, internally constructs the DataStream operators, and returns one (or more) result DataStreams.
The benefit is that we don't need to change the DataStream API directly but can put a library on top. The other libraries (CEP, Table, Gelly) follow this approach.

5) I'm skeptical about using queryable state to expose metrics. Did you consider using Flink's metrics system [1]? It is easily configurable and we provide several reporters that export the metrics.

What do you think?

Best,
Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html

2017-11-23 12:32 GMT+01:00 Stavros Kontopoulos:

> Hi guys,
>
> Let's discuss the new FLIP proposal for model serving over Flink. The idea
> is to combine previous efforts there and provide a library on top of Flink
> for serving models.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
>
> Code from previous efforts can be found here: https://github.com/FlinkML
>
> Best,
> Stavros
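
As an illustration of the Union type mentioned in point 2 of the reply, here is a minimal sketch in plain Java of a two-way variant (an n-ary version would add more cases). The names `Union2`, `first`, `second`, `isFirst`, and `fold` are hypothetical and appear neither in the FLIP nor in Flink; a real implementation would presumably build on Flink's own type system rather than on this class.

```java
import java.util.function.Function;

/**
 * Hypothetical two-way union of strongly-typed input records.
 * Illustrative only: this is not a Flink class and not part of FLIP-23.
 */
final class Union2<A, B> {
    private final A first;    // set only when this instance holds the first variant
    private final B second;   // set only when this instance holds the second variant
    private final boolean isFirst;

    private Union2(A first, B second, boolean isFirst) {
        this.first = first;
        this.second = second;
        this.isFirst = isFirst;
    }

    /** Wraps a record of the first input type. */
    static <A, B> Union2<A, B> first(A value) {
        return new Union2<>(value, null, true);
    }

    /** Wraps a record of the second input type. */
    static <A, B> Union2<A, B> second(B value) {
        return new Union2<>(null, value, false);
    }

    boolean isFirst() {
        return isFirst;
    }

    /** Dispatches on the stored variant so that callers never need to cast. */
    <R> R fold(Function<? super A, ? extends R> onFirst,
               Function<? super B, ? extends R> onSecond) {
        return isFirst ? onFirst.apply(first) : onSecond.apply(second);
    }
}
```

A model server built along these lines could accept a stream of `Union2<A, B>` records and use `fold` to route each record to the handler (or internal encoder) for its type, so users would not have to encode records as ProtoBuf messages themselves.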