From dev-return-106131-archive-asf-public=cust-asf.ponee.io@kafka.apache.org Mon Jul 29 21:58:19 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 9F89018063F for ; Mon, 29 Jul 2019 23:58:18 +0200 (CEST) Received: (qmail 51839 invoked by uid 500); 29 Jul 2019 21:58:15 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 51827 invoked by uid 99); 29 Jul 2019 21:58:14 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 29 Jul 2019 21:58:14 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id C558D18057E for ; Mon, 29 Jul 2019 21:58:13 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 0.601 X-Spam-Level: X-Spam-Status: No, score=0.601 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, KAM_ASCII_DIVIDERS=0.8, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd3-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=confluent.io Received: from mx1-ec2-va.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id s9BsiFN5s6Uu for ; Mon, 29 Jul 2019 21:58:09 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.210.173; helo=mail-pf1-f173.google.com; envelope-from=matthias@confluent.io; receiver= Received: from mail-pf1-f173.google.com (mail-pf1-f173.google.com [209.85.210.173]) by mx1-ec2-va.apache.org (ASF Mail Server at mx1-ec2-va.apache.org) with ESMTPS id 3A110BC7E1 for ; Mon, 29 Jul 2019 21:58:09 +0000 (UTC) Received: by mail-pf1-f173.google.com with SMTP id f17so24700375pfn.6 for ; Mon, 29 Jul 2019 14:58:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=confluent.io; s=google; h=subject:to:references:from:openpgp:autocrypt:organization :message-id:date:user-agent:mime-version:in-reply-to; bh=JJWMj5IDGIIYEUeIPbYHVCfQHE3mjWVW4jo8tBMqcnA=; b=Tm0tOhetZdI1WJulDolyBCPpuXHAEUJ/AUaZlIHsXz86B37fix7vJ9B7ha8Qw9mAHO 6TFomXESTPQOOa7F7154UwlRwOW8y/IBcxV6ZLZbfhKWREL3tTzX1LddNH1Bk9p/UTUI CSLZipJA8X93uoVUtlyXS9CFNLxuk73V2opF9YikvhTlQ9mIkNX/xCsvsshQ1OVoU9+T PQtLS740MJI2y8JjrwLqerQ7cGYtH8J0xCngFLUYejSwwBJzpcRK6J9rXg58ro5J3pLn V3A9c8nEakTKGLAWGLCIRtg4YaeUfuP2bZrsy94sNtpYfUE+U8uWS3yBeY5JRN4Jm5Xu 8Nww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:openpgp:autocrypt :organization:message-id:date:user-agent:mime-version:in-reply-to; bh=JJWMj5IDGIIYEUeIPbYHVCfQHE3mjWVW4jo8tBMqcnA=; b=nBqorzi7CZPonPDNUY0AtUuKn6MV78QSmBNa71MT4V0EEQXO+vcjvotaIrAqvzswjC 0jwMOnH760F2tJZCTTZLYC5zBWNPBRL7b8QWhgQx7iHBniEtJWhCX4RjKUcoUyKyt8Qr FbVgMxPMye7KQvY0mHO8n7RVl4chJQ87VuIWKmf6dwIfS+y6BLMEEybWm4Rd1RUBQtuc zhVC7v7PX6hl5sh2B3TbvmINGK6wkJjf2lj4CT4HjUoW6H60/8J7Utq1fXr7D8e2GpZq DdxUD2LDT8aEnksIvTm8xaR7InkeHMwlNMZKtBQZH2t/BsqIBt78H86h85ORN2IPZ5XG qWfQ== X-Gm-Message-State: APjAAAXAd0W5ECTakVeR/VZdGUTf8tRlQ/jn0dGK8EuS4spHZprou8Tu /3IBMYrqkBGEtj0f6JWnbTMUrO0MvahiMtaR0+nVawxNvQfllERBmBQ9t5C4XDFe/nLo/npGUsz Dd4s/0l9T98rwz7y7IfK5fzMbGoNydOGKMuDbU1N5q2OgQp318AFAtU8Xho7MCKnlEu4= X-Google-Smtp-Source: APXvYqzcVvKDKpf8NeSyT1vB9EXVrF/iz4mC+zN4imyX8NRJSkmYGv215w+OSqsHHXQLnrAvrUX7cw== X-Received: by 2002:a65:6546:: with SMTP id a6mr52584141pgw.220.1564437487107; Mon, 29 Jul 2019 14:58:07 -0700 (PDT) Received: from Matthias-Sax-Macbook-Pro.local (50-0-2-20.static.sonic.net. [50.0.2.20]) by smtp.gmail.com with ESMTPSA id a5sm54246723pjv.21.2019.07.29.14.58.05 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 29 Jul 2019 14:58:05 -0700 (PDT) Subject: Re: [DISCUSS] KIP-478 Strongly Typed Processor API To: dev@kafka.apache.org References: <9cb27a2b-703b-388d-fbaa-bb9a7b6cc6fb@confluent.io> From: "Matthias J. Sax" Openpgp: preference=signencrypt Autocrypt: addr=matthias@confluent.io; prefer-encrypt=mutual; keydata= mQINBFcGWisBEAD4+gj1tJcLJRckkbZjdJpd1347/Zwndn8R6r2X+YYS5EgwzP5OQHl3Q6jl hAISoqBEfDeTJffsxd1wWL+6wKU4Y7zCkH/3aL/7znOlfaewpgJP3x/naawgvnJ0jlPlJtev MlAbG+9P6aEVxYfML59KBtRKzd6OZbSh0VzCJVCvkslv+LZqR94lhA0rArupqe7EO9DuP4/V bvnDxx1dZFtEK4n4wJYsRkF+TuxGClLcfosfM0oHTZeolus+rJTi7wxrbcTOlTmOMW0Wf9rK AobXsSz838RJenQqe3X0s5EBKCoIdI2SCQiTfcJ2JTVt6Ip1IDuEVqMQmtz7i2l3Rlml0GDa gODehmeMczVIBeO0+cppzOEynjQlWLCbJ8XEjISMI87Ied6DGbEYKLnG4ucRjM//8syKI4T+ Z90Y060jhWxrvr2pGqPPaU3qvIXW1D1mchXE1ba4HOdKb6fA7j5NU47WA2YmWRDhfM4exE0Q mD3Jfjfjyuch2rGhT3twSWHk7v5zlINwOfTIfeXvShqxNzJRFf6MudFnFbgmMTo51LmPcXHz 3tUaRNoky/HRpSxU7h145SgltrKSmYgWUnG4Y3qySyiPVKfBUBi/e5dYTk4Y0NDWGhZxOXCs ZV0NQsuoqFD82LrglwECrcdHd2QaKnIX2eKB7j63dMsexFDjewARAQABtCdNYXR0aGlhcyBK LiBTYXggPG1hdHRoaWFzQGNvbmZsdWVudC5pbz6JAjgEEwECACIFAlcGWisCGwMGCwkIBwMC BhUIAgkKCwQWAgMBAh4BAheAAAoJEDwRZiHEirRPWcoP/if/UkALwOvkct/IufpQhJ9qmg/a +QrSkDbPFrkWA7r9aMnaX8C5xuFgFuekT2aMzYYEr3eQwNHeeeEEFUhJQnMsKrjMw55w8+J5 Zhzdg8OKepFT0BOovXzITZcyqBW5JdMjfwEkdztBmPHbvWCCjglSllCvNDorxF1FCYuiL+F3 aBx0SaKaXuNhA4GO2IBY68SQ04ueeSbKbnukyWdAYXObBPBvbcoJXeesX6QvANxSjCqPHRsc czAr1mADzPN58nRXrOYgeonhPROYRlhLyEJM9CnGby/GRb9WfrKFwQVVjpNT0dD9vOvobmEp o4//m6qeXh6xC/egW9vBl63fJNOb/A6T1JdFdU8VWpUofAZtPHE61Y11AmkL7EoY5eUhlL0i jCQ4+K4wERP5NGOBEHe6wfxWoQMnPwnj59N8GylCOMhaVaIYqqRuAssMWyliX8nAj0dO3Hbw EZmBMt/xAAZpXCJU3iDS1G0+fs5LIIBWJodBvjec3DpIlib3PGG6IKPfkwpTfyoStRaPAc+X 085/KhgI8hM1nCxzI/0iQSsuyrJikYcCiB+EukaxTy1TS7O3Ul7Sg6f5f2YXB3jh9rPgZf/C y8iIJDyY+zfJLaYPO3uMEtqW69SWVeLM2viy5pj3MtEzDgC16OLINRdIsibPXRchdv7KN2FZ w4uvgTnZuQINBFypIPoBEACu4ZoR57pUDJZ3UuQR/UQRetv/gZyVwhmx7zL2oA2ZLWn0GwWK ruMRdqFRh2dv0eml7W57GK2RJsvNS2hzPHDteHLgOXl5Hhg+mrP00A2srifiBN9sG1PM5tAd VwGllQcR786IiE90NP2g8C3ThCSNZCFmtVi5hmgoMoad/szOFGN7mCiSzoXETFiwmBFUfrRu Q955IMdWuS47WNBQmh631nRDrrk3hOL3NB6PPIvYHooBrTF9N0hL1qa0t4Eu06z3uiA1o0A1 cpTP7WjHyrc6wq54xTZGYX7r1c1MWQobiUg7I8W16WbZZdrKH1ZPXkHoS7w7uMyhjF5atXl+ APDtZ6fMljJ3JF/9rBgAGgSJGZj8qwDnrrA2CwXIvelztS96wJYg2WXmkH+wrTLAyZEZIpH5 vAAXeIQLV4/Fmx4RZzYVHwzuTfO4jg8Im1C22XgQ0sqbrKMzxa13ivKNBy4oRgxhxsU0iXqP wzjm1uW/t6LA61TGrwW50kNZ3rW3yG60msdRejnO5HC7kuLK9GV6cJArEJT01RRXDEmqqpG9 N7Jmit6gnCP16TPFB+QvKXab+AAwIeG6E9GcSdUAOkdD5PWv+v6XJrRkMqbfEH0j6Hc84g0z 9kCFTmnK0/zjpEhTp3zzoknGGciX1MAfxXoL/5JnukQWH3FRUPbfri3ZzwARAQABiQRyBBgB CAAmFiEEV9+AVI/wFUXaxShcPBFmIcSKtE8FAlypIPoCGwIFCQHhM4ACQAkQPBFmIcSKtE/B dCAEGQEIAB0WIQTyiy7YJwIIXl2i4ZC7w8Foa7nDUQUCXKkg+gAKCRC7w8Foa7nDUVnwD/40 R713CkSP0easwIIOyR3/oVUuTeDgtaBATYXczXLjXSYjpo6ErokhKn8cwSPhaXRGHkBMUk6b 97dQimN5p2+/PdolP8hFRvq3V9ZdcGIiK9Iq5kJlpbAV31r9Vhv7zq2wiym0LXrlwSFf+6Mn RWSIcU2Si1ySwMjXDaO7YlIj7sC/DVnlhGcPAvNgKyEt8ZZYYfWXLyrkhOzfgXInyo71XDQQ hu/8mSQwiFFyVlyw8QIWtrEPJmYHBYoBNp17qprlF/3WIiSaNDGN7Cf4Ft35XyMRLJWKsVSH FYdLJ2xxgrD58NnHXpw5ViWTHRg01CDLMjo21UaIKuLcPxsflSWkpbPULhO3Xpoiee1UxECe YBAQIRSXBdQx33SYiBdDYjBaoe/PmX3nbYf+xKEc2NOW7ClqaX/r8xh0cDiYuzOvJkvjVR6X KCi7VFcXxk7cVV65arnDFyryJ60EpNpf7BzeJxY6Lf4UfGrUNl6C//3XyKa/8eUj7RJlYROt zmSGboRNy0SJXQxThPHAMkbElEZrDe2sp4fsjlFmBDdxHojIyZnwySOrMXYPz9E4lXrVlWhs WLB9/sWWYjKWE0pq5Th6ddLLZM+blCMK6iFyQQXy83uqm9YR+eY0SGakJ1v4IVFfmO9PwqFU juMuk0Dl7IItv9pdUfYjsniPxr6Dz27TwI3NEAC5v7VvBJyilh+Honck5FbR9AUyka6D9wxb zsHWUKSQYRRXn8BfHGC+I9SLhZ+HbEwVG7069LOKz+nwOEdamZ4LDpyeRJRV9W7al2aNR66O yj+/4NG5aduk/WC8bKI6k2YkPu94abMEKHZPLXGMhbjQMlINnQaEcPBXG5FmsfPEIQs2XuMr GehVc8IQI1/o9Xn+4z69fgiXHyIEr6f+gfuSyGOaw/7gczDxxln0zUEZ2G5LUSv/RD2VaX39 PhRy7hoy9Uw4g9ZSs8OsQrjXjGTp9gX409WgDXXIss53jlCPoB59VGOsKWDBd22caG5PsdAY pLiE6JTOOoH3WHyzEshEiFbyQ1nIdFIdGzeNU+r05rtKT249SgX4fw8drym9c07PgjtSRgVa YgifOCOj9vCYgwh7Xny1B+TRX+qK3l3xrB/U6lyftZ9wdY3o2azT8uyKqG2V9nt1lFPmvJTP 2418Dglmqvpioq13zYbOUiC//k6JmSwFZR4SqAmYdsX2dh3DcIGV8CZrP1KcBsnwBgmc6Jqk zmqQ1AnWGbIyn5Bq8Ga2GtaPSUncPW+3iYAaAUWbAXODczMMFfg0UmMisyzqkbrGGymqTgVM sKLgdF+HYVyXZpbOqd/GoUDVeqAphDpypH8LABpVkIjbCnkp5UmM/gE3RU9+OP9cxzk0U7og YrkCDQRcqSFDARAAoPkZcOKC+ajTguBOSfnKykxEM8ITChN98S79pMxj6mSFi+wE3YBi/24H FCLsYcNRhDQ0LhhVDU9/e7PoGmNTFI0Eqe4yvzDVOBnUMyzhudcT3S/pDfWDCaKg/18bBQ1a c9QUJEd9PS6llv1/KUUsfP3HJBFPiJI7j7MC0cpWOf1h+hgQyRhuDn8tKEPUdQ0wdqILo9qN 5dzGf/Ilh9vcnFWdmn3TDS5nC1wFyfbZKhbm+F6paZGIIMVDUg3jI136INmSuidaFv2jJhWv BcugZ0z44s36BSkrIk4jMYN6y4xXkYWDdsyaeMdPGVkWwGkMKhLdchQGIB7LY9S6FuRZWf6Q 1N6gpJcSNX/V7ICrmUNn1zSBUXiL5oh1bhwQi4qvFS2aMzhDjcf7/Qx0LJGlGmBwB4lvyfd0 pk65Kwaqc08uNhnjlJtsdnbWfiWtRW8Y2FEwn/VGST/PiqknBYrfNh84jXpRIFVM76pduxY7 vqI8y5BZyL0o/UQkso82T+K9+V7L5k48MYPoNUdZM2xvSJ3xBgZjQP9WRQZdWxH3wf7ERaM7 d+8Z4rO/cwE+A7D3gjmfchc3o7IBJrMpwM3hITTc1gRfxHN7Fh+qQNdp6k84AAnvaFh3CSzQ MA4/VL9tzRA/ib9eBHe2r3pMp23sshA0ALS5yXIkD7J60CAjLbMAEQEAAYkCPAQYAQgAJhYh BFffgFSP8BVF2sUoXDwRZiHEirRPBQJcqSFDAhsMBQkB4TOAAAoJEDwRZiHEirRPwBkQANNw HwEEnJJiaRwrTz1u9Ie4GJG9A0tpJ5rL/DSgCdNHTmtr28oP0bjqTWnxbhBT0aGXi8tM0QJ/ isVqKfaV2Fl+32xG3a9/Y8hMqtGErZjZQjHAhBLw3tjc2/1g8EOmsb+P3AkbWNWatcejcK9U OErJvo/EzUxHLktgFSgKf6rsx4Pv910mFEczcsf8Fq75lpqV8QH+/n/gJx4enlhLOoE9MrG0 RDMbMzJMN5IDqGU3Ae/Khcya8jh5h8rSl4Y+mdXbR5yD//YXI/VK70wkk8Fe6JdA4+7DY9/s d6fi1t46D9tveO0nSnvxOdKMfoRDIyLRmZgQqLgjiC/1pJ1ZyZHMUpSvuX6Fy1G1jeAiR9Vj PVSzRMT0XtHVh6zeOPp+qIrMkuPQXeItn/t4KoY9bSCzU5YGEjmlDwQyEhVUpdi6Fa3O1vrb d4+pkVefdrAePy/W7dYiWzJKPT4lOKAFc1a9uJ5+koDb+imQd7MjX7bWFz29ocsfuqvDnKIy 08Oq3k0IhF/lGDQCekWJ6Tap3mtOrvo0m9Z3pwOhWrVdLnHAPWxXP5rJ3Pdame0pn8AFRG40 0XOX0Jn6YDBLXXynr79ZCqiSqs2VDLIP4kW3M4wYe7MsqvbGBTLyOcGnUEQMLT+mpaUJg+ac I5FRbGf3c/r2PPse2sMSKW6sZ1+MhAwK Organization: Confluent Inc Message-ID: <829b9156-3398-b4b6-1c3f-dc22de2709bb@confluent.io> Date: Mon, 29 Jul 2019 14:58:04 -0700 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.8.0 MIME-Version: 1.0 In-Reply-To: Content-Type: multipart/signed; micalg=pgp-sha512; protocol="application/pgp-signature"; boundary="7oh609Nxx9xQT224zql1DI5BGI0ZA8cec" --7oh609Nxx9xQT224zql1DI5BGI0ZA8cec Content-Type: multipart/mixed; boundary="MYAUjQlMdrQJBgPcFSIXlhcMOvSfbE3yI"; protected-headers="v1" From: "Matthias J. Sax" To: dev@kafka.apache.org Message-ID: <829b9156-3398-b4b6-1c3f-dc22de2709bb@confluent.io> Subject: Re: [DISCUSS] KIP-478 Strongly Typed Processor API References: <7b63503f-50b5-78c6-2592-c602c1b0e25a@confluent.io> <9cb27a2b-703b-388d-fbaa-bb9a7b6cc6fb@confluent.io> In-Reply-To: --MYAUjQlMdrQJBgPcFSIXlhcMOvSfbE3yI Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable Thanks for the details! Also talked to Guozhang about a potential upgrade path. This KIP seems not to put us into an bad position to provide a clean upgrade path if we change the `ProcessorContext` in the future. Thus, I think we can move forward. -Matthias On 7/24/19 3:32 PM, John Roesler wrote: > Hey again Matthias, >=20 > I think it might help to frame the evaluation of the Context question i= f we > have a "spitball" proposal for what change we might want to make to the= > context. >=20 > Currently, the ProcessorContext is referenced in the following public > interfaces: >=20 > org.apache.kafka.streams.errors.DeserializationExceptionHandler#handle > org.apache.kafka.streams.kstream.Transformer#init > org.apache.kafka.streams.kstream.ValueTransformer#init > org.apache.kafka.streams.kstream.ValueTransformerWithKey#init > org.apache.kafka.streams.processor.Processor#init > org.apache.kafka.streams.processor.StateStore#init >=20 > We can sub-divide the ProcessorContext into broad categories: > General Information: > * a handle on the config > * information about the execution context (what is the task id, the > application id, etc) > Record Information: > * extra information about the current record > Store Support: > * the ability to register state restore callbacks > Processor Support: > * the ability to schedule punctuations > * the ability to get registered state stores > * the ability to schedule punctuations > * the ability to forward records > * the ability to request commits >=20 > We could imagine slicing the Processor Context into four new component > interfaces, and making ProcessorContext just implement them. Then, we c= ould > mix-and-match those new component interfaces for use elsewhere. >=20 > E.g.,: > org.apache.kafka.streams.errors.DeserializationExceptionHandler#handle > * only gets the informational context >=20 > org.apache.kafka.streams.kstream.Transformer#init > org.apache.kafka.streams.kstream.ValueTransformer#init > org.apache.kafka.streams.kstream.ValueTransformerWithKey#init > * information context > * the ability to get registered state stores > Also > * the ability to schedule punctuations > * restricted ability to forward (only obeying the rules of the particul= ar > interface, for example) > Or maybe just: > * no ability to foraward > * the ability to schedule special punctuators that can return one or mo= re > keys or values when fired >=20 > org.apache.kafka.streams.processor.Processor#init > * all the contexts, except the ability to register state restore callba= cks >=20 > org.apache.kafka.streams.processor.StateStore#init > * information contexts > * the ability to register state restore callbacks > * maybe punctuations and forwards, could be discussed further >=20 >=20 > The operative question for us right now is whether there is a clean pat= h to > something like this from the current KIP, or whether we'd be forced to > deprecate an interface we are only just now adding. Note that the only > interfaces we're adding right now are : > * org.apache.kafka.streams.processor.api.Processor > * org.apache.kafka.streams.processor.api.ProcessorSupplier > And the only thing we need to make the above spitball proposal compatib= le > with these proposed interfaces is to deprecate the ability to register > state restore callbacks from the ProcessorContext. >=20 > Otherwise, we would at that time be able to propose new Transformer > interfaces that take (e.g.) TransformerContexts, likewise with > DeserializationExceptionHandler and StateStore. >=20 > In other words, I _think_ that we have a clean migration path to addres= s > the Context problem in follow-on work. But clearly my motivation may be= > corrupt. What do you think? >=20 > Thanks, > -John >=20 >=20 > On Wed, Jul 24, 2019 at 5:06 PM John Roesler wrote:= >=20 >> Hey Matthias, >> >> I agree, it's worth double-checking to make sure that the upgrade path= >> would be smooth. There's no point in putting ourselves in an awkward j= am. >> I'll look into it and report back. >> >> Regarding the global store logic, I confirmed that the "state update >> processor" shouldn't be forwarding anything, so we can safely bound it= s >> output type to `Void`. I've updated the KIP. >> >> Thanks, >> -John >> >> On Wed, Jul 24, 2019 at 3:08 PM Matthias J. Sax >> wrote: >> >>> If we don't fix the `ProcessorContext` now, how would an upgrade path= >>> look like? >>> >>> We woudl deprecate existing `init()` and add a new `init()`, and duri= ng >>> runtime need to call both? This sound rather error prone to me and mi= ght >>> be confusing to users? Hence, it might be beneficial to fix it right = now. >>> >>> If my concerns are not valid, and we think that the upgrade path will= >>> smooth, we can of course do a follow up KIP. Another possibility woul= d >>> be, to still do an extra KIP but ensure that both KIPs are contained = in >>> the same release. >>> >>> WDYT? >>> >>> >>> -Matthias >>> >>> On 7/24/19 11:55 AM, John Roesler wrote: >>>> Hey Matthias, >>>> >>>> Thanks for the review! >>>> >>>> I agree about ProcessorContext, it could certainly be split up to >>> improve >>>> compile-time clues about what is or is not permitted (like, do you j= ust >>>> want to be able to see the extra record context vs. forawrding vs. >>>> registering state stores, as you said). But, similar to the ideas ar= ound >>>> transforms, we can hopefully make that a separate design effort outs= ide >>> of >>>> this KIP. Is that ok with you? >>>> >>>> Note that, unlike the current Processor API, KIP-478 proposes to >>> provide a >>>> default no-op implementation of init(), which means we can deprecate= it >>>> later and replace it with one taking a cleaner "context" abstraction= , as >>>> you proposed. >>>> >>>> It's just that the typing change as proposed is already a very large= >>> design >>>> and implementation scope. I fear that adding in new flavors of >>>> ProcessorContext would make is much harder to actually consider the >>> design, >>>> and certainly stretch out the implementation phase as well. >>>> >>>> Regarding the documentation of non-goals, that's very good feedback.= >>> I'll >>>> update the KIP. >>>> >>>> Regarding addGlobalStore... I'll look into it. >>>> >>>> Thanks! >>>> -John >>>> >>>> >>>> >>>> On Wed, Jul 24, 2019 at 12:27 PM Matthias J. Sax >>> >>>> wrote: >>>> >>>>> I have concerns about the latest proposal from Guozhang. However, a= s >>>>> John said it's beyond the scope of this KIP and thus, I don't go in= to >>>>> details. I agree thought, that the current "transformer APIs" are n= ot >>>>> ideal and could be improved. >>>>> >>>>> >>>>> An orthogonal though is that we should split the current >>>>> `ProcessorContext` into multiple interfaces. Atm, the context can b= e >>> use >>>>> to: >>>>> >>>>> - access metadata >>>>> - schedule punctuation >>>>> - get state stores >>>>> - register state stores >>>>> - forward output data >>>>> >>>>> (1) registering state stores is only required if one implements a >>> custom >>>>> store, but not for a regular `Processor` implementation -- hence, i= t's >>> a >>>>> leaking abstraction >>>>> >>>>> (2) for `ValueTransformer` and `flatValueTransformer` we don't want= to >>>>> allow forwarding key-value pairs, and hence need to throw an RTE fo= r >>>>> this case atm >>>>> >>>>> (3) Why do we expose `keySerde()`, `valueSerde()`, and `stateDir()`= >>>>> explicitly? We have already `appConfigs()` to allow users to access= the >>>>> configuration. >>>>> >>>>> Overall, it seems that `ProcessorContext` is rather convoluted. >>> Because, >>>>> we add a new `Processor` abstraction, it seems like a good opportun= ity >>>>> to improve the interface and to not pass `ProcessroContext` into th= e >>> new >>>>> `Processor#init()` method, but an improved interface. >>>>> >>>>> Thoughts? >>>>> >>>>> >>>>> >>>>> One more nits about the KIP: >>>>> >>>>> I think, we should clearly state, that this change does not provide= >>> type >>>>> safety for PAPI users. The following example would compile without = any >>>>> errors or warning, even if the types don't match: >>>>> >>>>>> Topology t =3D new Topology(); >>>>>> t.addSource("s", ...); >>>>>> t.addProcessor("p1", new ProcessorSupplier>>>> BarValue>()..., "s"); >>>>>> t.addProcessor("p2", new ProcessorSupplier>> KOut, >>>>> VOut>()..., "p1"); >>>>> >>>>> Just want to make sure users understand the impact/scope of the cha= nge, >>>>> especially what is _not_ achieved. >>>>> >>>>> >>>>> About `addGlobalStore()` -- should the return types be `Void` simil= ar >>> to >>>>> `KStream#process()`? >>>>> >>>>> >>>>> >>>>> -Matthias >>>>> >>>>> >>>>> On 7/24/19 9:11 AM, Guozhang Wang wrote: >>>>>> Sounds good to me, thanks John! >>>>>> >>>>>> >>>>>> Guozhang >>>>>> >>>>>> On Wed, Jul 24, 2019 at 7:40 AM John Roesler >>> wrote: >>>>>> >>>>>>> Hey Guozhang, >>>>>>> >>>>>>> Thanks for the thought! It sounds related to what I was thinking = in >>>>>>> https://issues.apache.org/jira/browse/KAFKA-8396 , but a little >>>>> "extra"... >>>>>>> >>>>>>> I proposed to eliminate ValueTransformer, but I believe you're >>> right; we >>>>>>> could eliminate Transformer also and just use Processor in the >>>>> transform() >>>>>>> methods. >>>>>>> >>>>>>> To your first bullet, regarding transform/flatTransform... I'd ar= gue >>>>> that >>>>>>> the difference isn't material and if we switch to just using >>>>>>> context.forward instead of returns, then we just need one and peo= ple >>> can >>>>>>> call forward as much as they want. It certainly warrants further >>>>>>> discussion, though... >>>>>>> >>>>>>> To the second point, yes, I'm thinking that we can eschew the >>>>>>> ValueTransformer and instead do something like ignore the forward= ed >>> key >>>>> or >>>>>>> check the key for serial identity, etc. >>>>>>> >>>>>>> The ultimate advantage of these ideas is that we reduce the numbe= r of >>>>>>> interface variants and we also give people just one way to pass >>> values >>>>>>> forward instead of two. >>>>>>> >>>>>>> Of course, it's beyond the scope of this KIP, but this KIP is a >>>>>>> precondition for these further improvements. >>>>>>> >>>>>>> I'm copying your comment onto the ticket for posterity. >>>>>>> >>>>>>> Thanks! >>>>>>> -John >>>>>>> >>>>>>> On Tue, Jul 23, 2019 at 5:38 PM Guozhang Wang >>>>> wrote: >>>>>>> >>>>>>>> Hi John, >>>>>>>> >>>>>>>> Just a wild thought about Transformer: now with the new >>> Processor>>>>>>> KOut, VIn, VOut>#init(ProcessorContext), do we still= >>> need a >>>>>>>> Transformer (and even ValueTransformer / ValueTransformerWithKey= )? >>>>>>>> >>>>>>>> What if: >>>>>>>> >>>>>>>> * We just make KStream#transform to get a ProcessorSupplier as w= ell, >>>>> and >>>>>>>> inside `process()` we check that at most one `context.forward()`= is >>>>>>> called, >>>>>>>> and then take it as the return value. >>>>>>>> * We would still use ValueTransformer for KStream#transformValue= , >>> or we >>>>>>> can >>>>>>>> also use a `ProcessorSupplier where we allow at most one >>>>>>>> `context.forward()` AND we ignore whatever passed in as key but = just >>>>> use >>>>>>>> the original key. >>>>>>>> >>>>>>>> >>>>>>>> Guozhang >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 16, 2019 at 9:03 AM John Roesler = >>>>> wrote: >>>>>>>> >>>>>>>>> Hi again, all, >>>>>>>>> >>>>>>>>> I have started the voting thread. Please cast your votes (or vo= ice >>>>>>>>> your objections)! The vote will remain open at least 72 hours. >>> Once it >>>>>>>>> closes, I can send the PR pretty quickly. >>>>>>>>> >>>>>>>>> Thanks for all you help ironing out the details on this feature= =2E >>>>>>>>> -John >>>>>>>>> >>>>>>>>> On Mon, Jul 15, 2019 at 5:09 PM John Roesler >>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Hey all, >>>>>>>>>> >>>>>>>>>> It sounds like there's general agreement now on this KIP, so I= >>>>>>> updated >>>>>>>>>> the KIP to fit in with Guozhang's overall proposed package >>> structure. >>>>>>>>>> Specifically, the proposed name for the new Processor interfac= e is >>>>>>>>>> "org.apache.kafka.streams.processor.api.Processor". >>>>>>>>>> >>>>>>>>>> If there are no objections, then I plan to start the vote >>> tomorrow! >>>>>>>>>> >>>>>>>>>> Thanks, all, for your contributions. >>>>>>>>>> -John >>>>>>>>>> >>>>>>>>>> On Thu, Jul 11, 2019 at 1:50 PM Matthias J. Sax < >>>>>>> matthias@confluent.io >>>>>>>>> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>> Side remark: >>>>>>>>>>> >>>>>>>>>>>> Now that "flat transform" is a specific >>>>>>>>>>>>> part of the API it seems okay to steer folks in that direct= ion >>>>>>> (to >>>>>>>>> never >>>>>>>>>>>>> use context.process in a transformer), but it should be cal= led >>>>>>> out >>>>>>>>>>>>> explicitly in javadocs. Currently Transformer (which is us= ed >>>>>>> for >>>>>>>>> both >>>>>>>>>>>>> transform() and flatTransform() ) doesn't really call out t= he >>>>>>>>> ambiguity: >>>>>>>>>>> >>>>>>>>>>> Would you want to do a PR for address this? We are always eag= er >>> to >>>>>>>>>>> improve the JavaDocs! >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -Matthias >>>>>>>>>>> >>>>>>>>>>> On 7/7/19 11:26 AM, Paul Whalen wrote: >>>>>>>>>>>> First of all, +1 on the whole idea, my team has run into >>>>>>>> (admittedly >>>>>>>>> minor, >>>>>>>>>>>> but definitely annoying) issues because of the weaker typing= =2E >>>>>>>> We're >>>>>>>>> heavy >>>>>>>>>>>> users of the PAPI and have Processors that, while not hundre= ds >>> of >>>>>>>>> lines >>>>>>>>>>>> long, are certainly quite hefty and call context.forward() i= n >>>>>>> many >>>>>>>>> places. >>>>>>>>>>>> >>>>>>>>>>>> After reading the KIP and discussion a few times, I've convi= nced >>>>>>>>> myself >>>>>>>>>>>> that any initial concerns I had aren't really concerns at al= l >>>>>>>> (state >>>>>>>>> store >>>>>>>>>>>> types, for one). One thing I will mention: changing >>>>>>> *Transformer* >>>>>>>>> to have >>>>>>>>>>>> ProcessorContext gave me pause, because I have c= ode >>>>>>>> that >>>>>>>>> does >>>>>>>>>>>> context.forward in transformers. Now that "flat transform" = is a >>>>>>>>> specific >>>>>>>>>>>> part of the API it seems okay to steer folks in that directi= on >>>>>>> (to >>>>>>>>> never >>>>>>>>>>>> use context.process in a transformer), but it should be call= ed >>>>>>> out >>>>>>>>>>>> explicitly in javadocs. Currently Transformer (which is use= d >>> for >>>>>>>>> both >>>>>>>>>>>> transform() and flatTransform() ) doesn't really call out th= e >>>>>>>>> ambiguity: >>>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>> >>> https://github.com/apache/kafka/blob/ca641b3e2e48c14ff308181c77577540= 8f5f35f7/streams/src/main/java/org/apache/kafka/streams/kstream/Transform= er.java#L75-L77 >>>>>>>>> , >>>>>>>>>>>> and for migrating users (from before flatTransform) it could= be >>>>>>>>> confusing. >>>>>>>>>>>> >>>>>>>>>>>> Side note, I'd like to plug KIP-401 (there is a discussion >>> thread >>>>>>>>> and a >>>>>>>>>>>> voting thread) which also relates to using the PAPI. It see= ms >>>>>>> like >>>>>>>>> there >>>>>>>>>>>> is some interest and it is in a votable state with the major= ity >>>>>>> of >>>>>>>>>>>> implementation complete. >>>>>>>>>>>> >>>>>>>>>>>> Paul >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jun 28, 2019 at 2:02 PM Bill Bejeck >>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Sorry for coming late to the party. >>>>>>>>>>>>> >>>>>>>>>>>>> As for the naming I'm in favor of RecordProcessor as well. >>>>>>>>>>>>> >>>>>>>>>>>>> I agree that we should not take on doing all of the package= >>>>>>>>> movements as >>>>>>>>>>>>> part of this KIP, especially as John has pointed out, it wi= ll >>> be >>>>>>>> an >>>>>>>>>>>>> opportunity to discuss some clean-up on individual classes >>>>>>> which I >>>>>>>>> envision >>>>>>>>>>>>> becoming another somewhat involved process. >>>>>>>>>>>>> >>>>>>>>>>>>> For the end goal, if possible, here's what I propose. >>>>>>>>>>>>> >>>>>>>>>>>>> 1. We keep the scope of the KIP the same, *but we only >>>>>>>>> implement* *it in >>>>>>>>>>>>> phases* >>>>>>>>>>>>> 2. Phase one could include what Guozhang had proposed >>> earlier >>>>>>>>> namely >>>>>>>>>>>>> 1. > 1.a) modifying ProcessorContext only with the outpu= t >>>>>>> types >>>>>>>>> on >>>>>>>>>>>>> forward. >>>>>>>>>>>>> > 1.b) modifying Transformer signature to have generi= cs >>> of >>>>>>>>>>>>> ProcessorContext, >>>>>>>>>>>>> > and then lift the restricting of not using punctuat= e: >>> if >>>>>>>>> user did >>>>>>>>>>>>> not >>>>>>>>>>>>> > follow the enforced typing and just code without >>>>>>> generics, >>>>>>>>> they >>>>>>>>>>>>> will get >>>>>>>>>>>>> > warning at compile time and get run-time error if t= hey >>>>>>>>> forward >>>>>>>>>>>>> wrong-typed >>>>>>>>>>>>> > records, which I think would be acceptable. >>>>>>>>>>>>> 3. Then we could tackle other pieces in an incremental >>> manner >>>>>>>> as >>>>>>>>> we see >>>>>>>>>>>>> what makes sense >>>>>>>>>>>>> >>>>>>>>>>>>> Just my 2cents >>>>>>>>>>>>> >>>>>>>>>>>>> -Bill >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jun 24, 2019 at 10:22 PM Guozhang Wang < >>>>>>>> wangguoz@gmail.com> >>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hi John, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Yeah I think we should not do all the repackaging as part = of >>>>>>> this >>>>>>>>> KIP as >>>>>>>>>>>>>> well (we can just do the movement of the Processor / >>>>>>>>> ProcessorSupplier), >>>>>>>>>>>>>> but I think we need to discuss the end goal here since >>>>>>> otherwise >>>>>>>>> we may >>>>>>>>>>>>> do >>>>>>>>>>>>>> the repackaging of Processor in this KIP, but only later o= n >>>>>>>>> realizing >>>>>>>>>>>>> that >>>>>>>>>>>>>> other re-packagings are not our favorite solutions. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Guozhang >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Jun 24, 2019 at 3:06 PM John Roesler < >>>>>>> john@confluent.io> >>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hey Guozhang, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for the idea! I'm wondering if we could take a mid= dle >>>>>>>>> ground >>>>>>>>>>>>>>> and take your proposed layout as a "roadmap", while only >>>>>>>> actually >>>>>>>>>>>>>>> moving the classes that are already involved in this KIP.= >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> The reason I ask is not just to control the scope of this= >>> KIP, >>>>>>>> but >>>>>>>>>>>>>>> also, I think that if we move other classes to new packag= es, >>>>>>> we >>>>>>>>> might >>>>>>>>>>>>>>> also want to take the opportunity to clean up other thing= s >>>>>>> about >>>>>>>>> them. >>>>>>>>>>>>>>> But each one of those would become a discussion point of = its >>>>>>>> own, >>>>>>>>> so >>>>>>>>>>>>>>> it seems the discussion would become intractable. FWIW, I= do >>>>>>>> like >>>>>>>>> your >>>>>>>>>>>>>>> idea for precisely this reason, it creates opportunities = for >>>>>>> us >>>>>>>> to >>>>>>>>>>>>>>> consider other changes that we are simply not able to mak= e >>>>>>>> without >>>>>>>>>>>>>>> breaking source compatibility. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If the others feel "kind of favorable" with this overall >>>>>>> vision, >>>>>>>>> maybe >>>>>>>>>>>>>>> we can make one or more Jira tickets to capture it, and t= hen >>>>>>>> just >>>>>>>>>>>>>>> alter _this_ proposal to `processor.api.Processor` (etc).= >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> WDYT? >>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Sun, Jun 23, 2019 at 7:17 PM Guozhang Wang < >>>>>>>> wangguoz@gmail.com >>>>>>>>>> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hello John, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for your detailed explanation, I've done some qui= ck >>>>>>>>> checks on >>>>>>>>>>>>>> some >>>>>>>>>>>>>>>> existing examples that heavily used Processor and the >>> results >>>>>>>>> also >>>>>>>>>>>>>> makes >>>>>>>>>>>>>>> me >>>>>>>>>>>>>>>> worried about my previous statements that "the breakage >>> would >>>>>>>>> not be >>>>>>>>>>>>>>> big". >>>>>>>>>>>>>>>> I agree we should maintain compatibility. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> About the naming itself, I'm actually a bit inclined int= o >>>>>>>>>>>>> sub-packages >>>>>>>>>>>>>>> than >>>>>>>>>>>>>>>> renamed new classes, and my motivations are that our cur= rent >>>>>>>>>>>>> packaging >>>>>>>>>>>>>> is >>>>>>>>>>>>>>>> already quite coarsen grained and sometimes ill-placed, = and >>>>>>>> hence >>>>>>>>>>>>> maybe >>>>>>>>>>>>>>> we >>>>>>>>>>>>>>>> can take this change along with some clean up on package= s >>>>>>> (but >>>>>>>>> again, >>>>>>>>>>>>>> we >>>>>>>>>>>>>>>> should follow the deprecate - removal path). What I'm >>>>>>> thinking >>>>>>>>> is: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ------------------- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> processor/: >>>>>>>>> StateRestoreCallback/AbstractNotifyingRestoreCallback, >>>>>>>>>>>>>>> (deprecated >>>>>>>>>>>>>>>> later, same meaning for other cross-throughs), >>>>>>> ProcessContest, >>>>>>>>>>>>>>>> RecordContext, Punctuator, PunctuationType, To, Cancella= ble >>>>>>>> (are >>>>>>>>> the >>>>>>>>>>>>>> only >>>>>>>>>>>>>>>> things left) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) processor/api/: Processor, ProcessorSupplier (and = of >>>>>>>>> course, >>>>>>>>>>>>>> these >>>>>>>>>>>>>>>> two classes can be strong typed) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> state/: StateStore, BatchingStateRestoreCallback, >>>>>>>>>>>>>>>> AbstractNotifyingBatchingRestoreCallback (moved from >>>>>>>> processor/), >>>>>>>>>>>>>>>> PartitionGrouper, WindowStoreIterator, StateSerdes (this= one >>>>>>>> can >>>>>>>>> be >>>>>>>>>>>>>> moved >>>>>>>>>>>>>>>> into state/internals), TimestampedByteStore (we can move= >>> this >>>>>>>> to >>>>>>>>>>>>>>> internals >>>>>>>>>>>>>>>> since store types would use vat by default, see below), >>>>>>>>>>>>>> ValueAndTimestamp >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) state/factory/: Stores, StoreBuilder, StoreSupplie= r; >>>>>>>> *BUT* >>>>>>>>> the >>>>>>>>>>>>>> new >>>>>>>>>>>>>>>> Stores would not have timestampedXXBuilder APIs since th= e >>>>>>>> default >>>>>>>>>>>>>>>> StoreSupplier / StoreBuilder value types are >>>>>>> ValueAndTimestamp >>>>>>>>>>>>> already. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) state/queryable/: QueryableStoreType, >>>>>>>> QueryableStoreTypes, >>>>>>>>>>>>>> HostInfo >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) state/keyValue/: KeyValueXXX classes, and also the= >>> same >>>>>>>> for >>>>>>>>>>>>>>>> state/sessionWindow and state/timeWindow; *BUT* here we = use >>>>>>>>>>>>>>>> ValueAndTimestamp as value types of those APIs directly,= and >>>>>>>> also >>>>>>>>>>>>>>>> TimestampedKeyValue/WindowStore would be deprecated. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) kstream/api/: KStream, KTable, GroupedKStream (ren= amed >>>>>>>> from >>>>>>>>>>>>>>>> KGroupedStream), GroupedKTable (renamed from KGroupedTab= le), >>>>>>>>>>>>>>>> TimeWindowedKStream, SessionWindowedKStream, GlobalKTabl= e >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) kstream/operator/: Aggregator, ForeachFunction, .= =2E. , >>>>>>>>> Merger >>>>>>>>>>>>> and >>>>>>>>>>>>>>>> Grouped, Joined, Materialized, ... , Printed and >>> Transformer, >>>>>>>>>>>>>>>> TransformerSupplier. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) kstream/window/: Window, Windows, Windowed, >>>>>>> TimeWindows, >>>>>>>>>>>>>>>> SessionWindows, UnlimitedWindows, JoinWindows, >>>>>>> WindowedSerdes, >>>>>>>>>>>>>>>> Time/SessionWindowedSerialized/Deserializer. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) configure/: RocksDBConfigSetter, TopicNameExtracto= r, >>>>>>>>>>>>>>>> TimestampExtractor, UsePreviousTimeOnInvalidTimestamp, >>>>>>>>>>>>>>>> WallclockTimestampExtractor, ExtractRecordMetadataTimest= amp, >>>>>>>>>>>>>>>> FailOnInvalidTimestamp, LogAndSkipOnInvalidTimestamp, >>>>>>>>>>>>>>> StateRestoreListener, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (new) metadata/: StreamsMetadata, ThreadMetadata, >>>>>>> TaskMetadata, >>>>>>>>>>>>> TaskId >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Still, any xxx/internals packages are declared as inner >>>>>>>> classes, >>>>>>>>> but >>>>>>>>>>>>>>> other >>>>>>>>>>>>>>>> xxx/yyy packages are declared as public APIs. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> ------------------- >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This is a very wild thought and I can totally understand= if >>>>>>>>> people >>>>>>>>>>>>> feel >>>>>>>>>>>>>>>> this is too much since it definitely enlarges the scope = of >>>>>>> this >>>>>>>>> KIP a >>>>>>>>>>>>>> lot >>>>>>>>>>>>>>>> :) just trying to play a devil's advocate here to do maj= or >>>>>>>>>>>>> refactoring >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> avoid renaming Processor classes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Guozhang >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Fri, Jun 21, 2019 at 9:51 PM Matthias J. Sax < >>>>>>>>>>>>> matthias@confluent.io >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I think `RecordProcessor` is a good name. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On 6/21/19 5:09 PM, John Roesler wrote: >>>>>>>>>>>>>>>>>> After kicking the naming around a bit more, it seems l= ike >>>>>>> any >>>>>>>>>>>>>> package >>>>>>>>>>>>>>>>>> name change is a bit "weird" because it fragments the >>>>>>> package >>>>>>>>> and >>>>>>>>>>>>>>>>>> directory structure. If we can come up with a reasonab= le >>>>>>> name >>>>>>>>> for >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>> interface after all, it seems like the better choice. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The real challenge is that the existing name "Processo= r" >>>>>>>> seems >>>>>>>>>>>>> just >>>>>>>>>>>>>>>>>> about perfect. In picking a new name, we need to consi= der >>>>>>> the >>>>>>>>>>>>>>> ultimate >>>>>>>>>>>>>>>>>> state, after the deprecation period, when we entirely >>>>>>> remove >>>>>>>>>>>>>>>>>> Processor. In this context, TypedProcessor seems a lit= tle >>>>>>> odd >>>>>>>>> to >>>>>>>>>>>>>> me, >>>>>>>>>>>>>>>>>> because it seems to imply that there should also be an= >>>>>>>> "untyped >>>>>>>>>>>>>>>>>> processor". >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> After kicking around a few other ideas, what does ever= yone >>>>>>>>> think >>>>>>>>>>>>>>> about >>>>>>>>>>>>>>>>>> "RecordProcessor"? I _think_ maybe it stands on its ow= n >>>>>>> just >>>>>>>>>>>>> fine, >>>>>>>>>>>>>>>>>> because it's a thing that processes... records? >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> If others agree with this, I can change the proposal t= o >>>>>>>>>>>>>>> RecordProcessor. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Fri, Jun 21, 2019 at 6:42 PM John Roesler < >>>>>>>>> john@confluent.io> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> I've updated the KIP with the feedback so far. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> The naming question is still the biggest (only?) >>>>>>> outstanding >>>>>>>>>>>>>> issue. >>>>>>>>>>>>>>> It >>>>>>>>>>>>>>>>>>> would be good to hear some more thoughts on it. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> As we stand now, there's one vote for changing the >>> package >>>>>>>>> name >>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> something like 'typedprocessor', one for changing the= >>>>>>>>> interface >>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> TypedProcessor (as in the PoC), and one for just chan= ging >>>>>>>> the >>>>>>>>>>>>>>>>>>> Processor interface in-place, breaking source >>>>>>> compatibility. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> How can we resolve this decision? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> On Thu, Jun 20, 2019 at 5:44 PM John Roesler < >>>>>>>>> john@confluent.io >>>>>>>>>>>>>> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks for the feedback, Guozhang and Matthias, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding motivation: I'll update the wiki. Briefly:= >>>>>>>>>>>>>>>>>>>> * Any processor can benefit. Imagine a pure user of = the >>>>>>>>>>>>>>> ProcessorAPI >>>>>>>>>>>>>>>>>>>> who has very complex processing logic. I have seen >>>>>>> several >>>>>>>>>>>>>>> processor >>>>>>>>>>>>>>>>>>>> implementation that are hundreds of lines long and c= all >>>>>>>>>>>>>>>>>>>> `context.forward` in many different locations and >>>>>>> branches. >>>>>>>>> In >>>>>>>>>>>>>>> such an >>>>>>>>>>>>>>>>>>>> implementation, it would be very easy to have a bug = in a >>>>>>>>> rarely >>>>>>>>>>>>>>> used >>>>>>>>>>>>>>>>>>>> branch that forwards the wrong kind of value. This w= ould >>>>>>>>>>>>>>> structurally >>>>>>>>>>>>>>>>>>>> prevent that from happening. >>>>>>>>>>>>>>>>>>>> * Also, anyone who heavily uses the ProcessorAPI wou= ld >>>>>>>> likely >>>>>>>>>>>>>> have >>>>>>>>>>>>>>>>>>>> developed helper methods to wire together processors= , >>>>>>> just >>>>>>>> as >>>>>>>>>>>>> we >>>>>>>>>>>>>>> have >>>>>>>>>>>>>>>>>>>> in the DSL implementation. This change would enable = them >>>>>>> to >>>>>>>>>>>>>> ensure >>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>> compile time that they are actually wiring together >>>>>>>>> compatible >>>>>>>>>>>>>>> types. >>>>>>>>>>>>>>>>>>>> This was actually _my_ original motivation, since I >>> found >>>>>>>> it >>>>>>>>>>>>> very >>>>>>>>>>>>>>>>>>>> difficult and time consuming to follow the Streams D= SL >>>>>>>>> internal >>>>>>>>>>>>>>>>>>>> builders. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding breaking the source compatibility of >>>>>>> Processor: I >>>>>>>>>>>>> would >>>>>>>>>>>>>>>>>>>> _love_ to side-step the naming problem, but I really= >>>>>>> don't >>>>>>>>> know >>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>>>>> it's excusable to break compatibility. I suspect tha= t >>> our >>>>>>>>>>>>> oldest >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>> dearest friends are using the ProcessorAPI in some f= orm >>>>>>> or >>>>>>>>>>>>>> another, >>>>>>>>>>>>>>>>>>>> and all their source code would break. It sucks to h= ave >>>>>>> to >>>>>>>>>>>>>> create a >>>>>>>>>>>>>>>>>>>> whole new interface to get around this, but it feels= >>> like >>>>>>>> the >>>>>>>>>>>>>> right >>>>>>>>>>>>>>>>>>>> thing to do. Would be nice to get even more feedback= on >>>>>>>> this >>>>>>>>>>>>>> point, >>>>>>>>>>>>>>>>>>>> though. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding the types of stores, as I said in my respo= nse >>>>>>> to >>>>>>>>>>>>>> Sophie, >>>>>>>>>>>>>>>>>>>> it's not an issue. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding the change to StreamsBuilder, it doesn't p= in >>>>>>> the >>>>>>>>>>>>> types >>>>>>>>>>>>>> in >>>>>>>>>>>>>>>>>>>> any way, since all the types are bounded by Object o= nly, >>>>>>>> and >>>>>>>>>>>>>> there >>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>> no extra constraints between arguments (each type is= >>> used >>>>>>>>> only >>>>>>>>>>>>>>> once in >>>>>>>>>>>>>>>>>>>> one argument). But maybe I missed the point you were= >>>>>>> asking >>>>>>>>>>>>>> about. >>>>>>>>>>>>>>>>>>>> Since the type takes generic paramters, we should al= low >>>>>>>> users >>>>>>>>>>>>> to >>>>>>>>>>>>>>> pass >>>>>>>>>>>>>>>>>>>> in parameterized arguments. Otherwise, they would _h= ave >>>>>>> to_ >>>>>>>>>>>>> give >>>>>>>>>>>>>>> us a >>>>>>>>>>>>>>>>>>>> raw type, and they would be forced to get a "rawtyes= " >>>>>>>> warning >>>>>>>>>>>>>> from >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>> compiler. So, it's our obligation in any API that >>>>>>> accepts a >>>>>>>>>>>>>>>>>>>> parameterized-type parameter to allow people to actu= ally >>>>>>>>> pass a >>>>>>>>>>>>>>>>>>>> parameterized type, even if we don't actually use th= e >>>>>>>>>>>>> parameters. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> The naming question is a complex one, as I took pain= s to >>>>>>>>> detail >>>>>>>>>>>>>>>>>>>> previously. Please don't just pick out one minor poi= nt, >>>>>>>> call >>>>>>>>> it >>>>>>>>>>>>>>> weak, >>>>>>>>>>>>>>>>>>>> and then claim that it invalidates the whole decisio= n. I >>>>>>>>> don't >>>>>>>>>>>>>>> think >>>>>>>>>>>>>>>>>>>> there's a clear best choice, so I'm more than happy = for >>>>>>>>> someone >>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>> advocate for renaming the class instead of the packa= ge. >>>>>>> Can >>>>>>>>> you >>>>>>>>>>>>>>>>>>>> provide some reasons why you think that would be bet= ter? >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Regarding the deprecated methods, you're absolutely >>>>>>> right. >>>>>>>>> I'll >>>>>>>>>>>>>>>> update the KIP. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Thanks again for all the feedback! >>>>>>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Thu, Jun 20, 2019 at 4:34 PM Matthias J. Sax < >>>>>>>>>>>>>>> matthias@confluent.io> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Just want to second what Sophie said about the stor= es. >>>>>>> The >>>>>>>>>>>>> type >>>>>>>>>>>>>>> of a >>>>>>>>>>>>>>>>>>>>> used stores is completely independent of input/outp= ut >>>>>>>> types. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> This related to change `addGlobalStore()` method. W= hy >>> do >>>>>>>> you >>>>>>>>>>>>>> want >>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>> pin >>>>>>>>>>>>>>>>>>>>> the types? In fact, people request the ability to >>>>>>> filter() >>>>>>>>> and >>>>>>>>>>>>>>> maybe >>>>>>>>>>>>>>>>>>>>> even map() the data before they are put into the gl= obal >>>>>>>>> store. >>>>>>>>>>>>>>>> Limiting >>>>>>>>>>>>>>>>>>>>> the types seems to be a step backward here? >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Also, the pack name is questionable. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> This wouldn't be the first project to do something= >>> like >>>>>>>>>>>>> this... >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> Not a strong argument. I would actually propose to = not >>>>>>> a a >>>>>>>>> new >>>>>>>>>>>>>>>> package, >>>>>>>>>>>>>>>>>>>>> but just a new class `TypedProcessor`. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> For `ProcessorContext#forward` methods -- some of t= hose >>>>>>>>>>>>> methods >>>>>>>>>>>>>>> are >>>>>>>>>>>>>>>>>>>>> already deprecated. While the will still be affecte= d, >>> it >>>>>>>>> would >>>>>>>>>>>>>> be >>>>>>>>>>>>>>>> worth >>>>>>>>>>>>>>>>>>>>> to mark them as deprecated in the wiki page, too. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> @Guozhang: I dont' think we should break source >>>>>>>>> compatibility >>>>>>>>>>>>>> in a >>>>>>>>>>>>>>>> minor >>>>>>>>>>>>>>>>>>>>> release. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Matthias >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> On 6/20/19 1:43 PM, Guozhang Wang wrote: >>>>>>>>>>>>>>>>>>>>>> Hi John, >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks for KIP! I've a few comments below: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1. So far the "Motivation" section is very general= , >>> and >>>>>>>> the >>>>>>>>>>>>>> only >>>>>>>>>>>>>>>> concrete >>>>>>>>>>>>>>>>>>>>>> example that I have in mind is >>>>>>>> `TransformValues#punctuate`. >>>>>>>>>>>>> Do >>>>>>>>>>>>>> we >>>>>>>>>>>>>>>> have any >>>>>>>>>>>>>>>>>>>>>> other concrete issues that drive this KIP? If not = then >>>>>>> I >>>>>>>>> feel >>>>>>>>>>>>>>>> better to >>>>>>>>>>>>>>>>>>>>>> narrow the scope of this KIP to: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 1.a) modifying ProcessorContext only with the outp= ut >>>>>>>> types >>>>>>>>> on >>>>>>>>>>>>>>>> forward. >>>>>>>>>>>>>>>>>>>>>> 1.b) modifying Transformer signature to have gener= ics >>>>>>> of >>>>>>>>>>>>>>>> ProcessorContext, >>>>>>>>>>>>>>>>>>>>>> and then lift the restricting of not using punctua= te: >>>>>>> if >>>>>>>>> user >>>>>>>>>>>>>> did >>>>>>>>>>>>>>>> not >>>>>>>>>>>>>>>>>>>>>> follow the enforced typing and just code without >>>>>>>> generics, >>>>>>>>>>>>> they >>>>>>>>>>>>>>>> will get >>>>>>>>>>>>>>>>>>>>>> warning at compile time and get run-time error if = they >>>>>>>>>>>>> forward >>>>>>>>>>>>>>>> wrong-typed >>>>>>>>>>>>>>>>>>>>>> records, which I think would be acceptable. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> I feel this would be a good solution for this spec= ific >>>>>>>>> issue; >>>>>>>>>>>>>>>> again, feel >>>>>>>>>>>>>>>>>>>>>> free to update the wiki page with other known issu= es >>>>>>> that >>>>>>>>>>>>>> cannot >>>>>>>>>>>>>>> be >>>>>>>>>>>>>>>>>>>>>> resolved. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> 2. If, we want to go with the current scope then m= y >>>>>>> next >>>>>>>>>>>>>> question >>>>>>>>>>>>>>>> would be, >>>>>>>>>>>>>>>>>>>>>> how much breakage we would introducing if we just >>>>>>> modify >>>>>>>>> the >>>>>>>>>>>>>>>> Processor >>>>>>>>>>>>>>>>>>>>>> signature directly? My feeling is that DSL users w= ould >>>>>>> be >>>>>>>>>>>>> most >>>>>>>>>>>>>>>> likely not >>>>>>>>>>>>>>>>>>>>>> affected and PAPI users only need to modify a few >>> lines >>>>>>>> on >>>>>>>>>>>>>> class >>>>>>>>>>>>>>>>>>>>>> declaration. I feel it worth doing some research o= n >>>>>>> this >>>>>>>>> part >>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> then >>>>>>>>>>>>>>>>>>>>>> decide if we really want to bite the bullet of >>>>>>> duplicated >>>>>>>>>>>>>>> Processor >>>>>>>>>>>>>>>> / >>>>>>>>>>>>>>>>>>>>>> ProcessorSupplier classes for maintaining >>>>>>> compatibility. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Guozhang >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Wed, Jun 19, 2019 at 12:21 PM John Roesler < >>>>>>>>>>>>>> john@confluent.io >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> In response to the feedback so far, I changed the= >>>>>>>> package >>>>>>>>>>>>> name >>>>>>>>>>>>>>> from >>>>>>>>>>>>>>>>>>>>>>> `processor2` to `processor.generic`. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 4:49 PM John Roesler < >>>>>>>>>>>>>> john@confluent.io >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the feedback, Sophie! >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I actually felt a little uneasy when I wrote tha= t >>>>>>>> remark, >>>>>>>>>>>>>>> because >>>>>>>>>>>>>>>> it's >>>>>>>>>>>>>>>>>>>>>>>> not restricted at all in the API, it's just >>> available >>>>>>>> to >>>>>>>>>>>>> you >>>>>>>>>>>>>> if >>>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>>>>>>>>>> choose to give your stores and context the same >>>>>>>>> parameters. >>>>>>>>>>>>>>> So, I >>>>>>>>>>>>>>>>>>>>>>>> think your use case is valid, and also perfectly= >>>>>>>>>>>>> permissable >>>>>>>>>>>>>>>> under the >>>>>>>>>>>>>>>>>>>>>>>> current KIP. Sorry for sowing confusion on my ow= n >>>>>>>>>>>>> discussion >>>>>>>>>>>>>>>> thread! >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I'm not crazy about the package name, either. I = went >>>>>>>> with >>>>>>>>>>>>> it >>>>>>>>>>>>>>> only >>>>>>>>>>>>>>>>>>>>>>>> because there's seemingly nothing special about = the >>>>>>> new >>>>>>>>>>>>>> package >>>>>>>>>>>>>>>> except >>>>>>>>>>>>>>>>>>>>>>>> that it can't have the same name as the old one.= >>>>>>>>> Otherwise, >>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> existing "processor" and "Processor" names for t= he >>>>>>>>> package >>>>>>>>>>>>>> and >>>>>>>>>>>>>>>> class >>>>>>>>>>>>>>>>>>>>>>>> are perfectly satisfying. Rather than pile on >>>>>>>> additional >>>>>>>>>>>>>>>> semantics, it >>>>>>>>>>>>>>>>>>>>>>>> seemed cleaner to just add a number to the packa= ge >>>>>>>> name. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> This wouldn't be the first project to do somethi= ng >>>>>>> like >>>>>>>>>>>>>> this... >>>>>>>>>>>>>>>> Apache >>>>>>>>>>>>>>>>>>>>>>>> Commons, for example, has added a "2" to the end= of >>>>>>>> some >>>>>>>>> of >>>>>>>>>>>>>>> their >>>>>>>>>>>>>>>>>>>>>>>> packages for exactly the same reason. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> I'm open to any suggestions. For example, we cou= ld >>> do >>>>>>>>>>>>>> something >>>>>>>>>>>>>>>> like >>>>>>>>>>>>>>>>>>>>>>>> org.apache.kafka.streams.typedprocessor.Processo= r or >>>>>>>>>>>>>>>>>>>>>>>> org.apache.kafka.streams.processor.typed.Process= or , >>>>>>>>> which >>>>>>>>>>>>>>> would >>>>>>>>>>>>>>>> have >>>>>>>>>>>>>>>>>>>>>>>> just about the same effect. One microscopic thou= ght >>>>>>> is >>>>>>>>>>>>> that, >>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>>>>>>>>> there's another interface in the "processor" pac= kage >>>>>>>> that >>>>>>>>>>>>> we >>>>>>>>>>>>>>> wish >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>> do the same thing to, would _could_ pile it in t= o >>>>>>>>>>>>>> "processor2", >>>>>>>>>>>>>>>> but we >>>>>>>>>>>>>>>>>>>>>>>> couldn't do the same if we use a package that ha= s >>>>>>>> "typed" >>>>>>>>>>>>> in >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>> name, >>>>>>>>>>>>>>>>>>>>>>>> unless that change is _also_ related to types in= >>> some >>>>>>>>> way. >>>>>>>>>>>>>> But >>>>>>>>>>>>>>>> this >>>>>>>>>>>>>>>>>>>>>>>> seems like a very minor concern. >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> What's your preference? >>>>>>>>>>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 3:56 PM Sophie Blee-Gold= man >>> < >>>>>>>>>>>>>>>> sophie@confluent.io> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Hey John, thanks for writing this up! I like th= e >>>>>>>>> proposal >>>>>>>>>>>>>> but >>>>>>>>>>>>>>>> there's >>>>>>>>>>>>>>>>>>>>>>> one >>>>>>>>>>>>>>>>>>>>>>>>> point that I think may be too restrictive: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> "A processor that happens to use a typed store = is >>>>>>>>> actually >>>>>>>>>>>>>>>> emitting the >>>>>>>>>>>>>>>>>>>>>>>>> same types that it is storing." >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I can imagine someone could want to leverage th= is >>>>>>> new >>>>>>>>> type >>>>>>>>>>>>>>> safety >>>>>>>>>>>>>>>>>>>>>>> without >>>>>>>>>>>>>>>>>>>>>>>>> also limiting how they can interact with/use th= eir >>>>>>>>> store. >>>>>>>>>>>>> As >>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>> (admittedly >>>>>>>>>>>>>>>>>>>>>>>>> contrived) example, say you have an input strea= m of >>>>>>>>>>>>>> purchases >>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>> certain >>>>>>>>>>>>>>>>>>>>>>>>> type (entertainment, food, etc), and on seeing = a >>> new >>>>>>>>>>>>> record >>>>>>>>>>>>>>> you >>>>>>>>>>>>>>>> want to >>>>>>>>>>>>>>>>>>>>>>>>> output how many types of purchase a shopper has= >>> made >>>>>>>>> more >>>>>>>>>>>>>>> than 5 >>>>>>>>>>>>>>>>>>>>>>> purchases >>>>>>>>>>>>>>>>>>>>>>>>> of in the last month. Your state store will >>> probably >>>>>>>> be >>>>>>>>>>>>>>> holding >>>>>>>>>>>>>>>> some >>>>>>>>>>>>>>>>>>>>>>> more >>>>>>>>>>>>>>>>>>>>>>>>> complicated PurchaseHistory object (keyed by us= er), >>>>>>>> but >>>>>>>>>>>>> your >>>>>>>>>>>>>>>> output is >>>>>>>>>>>>>>>>>>>>>>> just >>>>>>>>>>>>>>>>>>>>>>>>> a >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I'm also not crazy about "processor2" as the >>> package >>>>>>>>> name >>>>>>>>>>>>>> ... >>>>>>>>>>>>>>>> not sure >>>>>>>>>>>>>>>>>>>>>>> what >>>>>>>>>>>>>>>>>>>>>>>>> a better one would be though (something with >>>>>>> "typed"?) >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Jun 17, 2019 at 12:47 PM John Roesler <= >>>>>>>>>>>>>>> john@confluent.io> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi all, >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> I'd like to propose KIP-478 ( >>>>>>>>>>>>>>>>>>>>>>> https://cwiki.apache.org/confluence/x/2SkLBw >>>>>>>>>>>>>>>>>>>>>>>>>> ). >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> This proposal would add output type bounds to = the >>>>>>>>>>>>> Processor >>>>>>>>>>>>>>>> interface >>>>>>>>>>>>>>>>>>>>>>>>>> in Kafka Streams, which enables static checkin= g of >>>>>>> a >>>>>>>>>>>>> number >>>>>>>>>>>>>>> of >>>>>>>>>>>>>>>> useful >>>>>>>>>>>>>>>>>>>>>>>>>> properties: >>>>>>>>>>>>>>>>>>>>>>>>>> * A processor B that consumes the output of >>>>>>>> processor A >>>>>>>>>>>>> is >>>>>>>>>>>>>>>> actually >>>>>>>>>>>>>>>>>>>>>>>>>> expecting the same types that processor A >>> produces. >>>>>>>>>>>>>>>>>>>>>>>>>> * A processor that happens to use a typed stor= e is >>>>>>>>>>>>> actually >>>>>>>>>>>>>>>> emitting >>>>>>>>>>>>>>>>>>>>>>>>>> the same types that it is storing. >>>>>>>>>>>>>>>>>>>>>>>>>> * A processor is simply forwarding the expecte= d >>>>>>> types >>>>>>>>> in >>>>>>>>>>>>>> all >>>>>>>>>>>>>>>> code >>>>>>>>>>>>>>>>>>>>>>> paths. >>>>>>>>>>>>>>>>>>>>>>>>>> * Processors added via the Streams DSL, which = are >>>>>>> not >>>>>>>>>>>>>>> permitted >>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>> forward results at all are statically prevente= d >>>>>>> from >>>>>>>>>>>>> doing >>>>>>>>>>>>>> so >>>>>>>>>>>>>>>> by the >>>>>>>>>>>>>>>>>>>>>>>>>> compiler >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Internally, we can use the above properties to= >>>>>>>> achieve >>>>>>>>> a >>>>>>>>>>>>>> much >>>>>>>>>>>>>>>> higher >>>>>>>>>>>>>>>>>>>>>>>>>> level of confidence in the Streams DSL >>>>>>>> implementation's >>>>>>>>>>>>>>>> correctness. >>>>>>>>>>>>>>>>>>>>>>>>>> Actually, while doing the POC, I found a few b= ugs >>>>>>> and >>>>>>>>>>>>>>> mistakes, >>>>>>>>>>>>>>>> which >>>>>>>>>>>>>>>>>>>>>>>>>> become structurally impossible with KIP-478. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, the stronger types dramatically >>>>>>> improve >>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>> self-documentation of our Streams internal >>>>>>>>>>>>> implementations, >>>>>>>>>>>>>>>> which >>>>>>>>>>>>>>>>>>>>>>>>>> makes it much easier for new contributors to r= amp >>>>>>> up >>>>>>>>> with >>>>>>>>>>>>>>>> confidence. >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks so much for your consideration! >>>>>>>>>>>>>>>>>>>>>>>>>> -John >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>>> -- Guozhang >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> -- Guozhang >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> -- Guozhang >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>> >>> >>> >=20 --MYAUjQlMdrQJBgPcFSIXlhcMOvSfbE3yI-- --7oh609Nxx9xQT224zql1DI5BGI0ZA8cec Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Comment: GPGTools - https://gpgtools.org iQIzBAEBCgAdFiEE8osu2CcCCF5douGQu8PBaGu5w1EFAl0/a+wACgkQu8PBaGu5 w1EgVw/+OcLRVusG4Z2em2JIIcy/M8/3mXdn8Jyhzoxb5KWOob38O0eXWy1bcOti NPD++QIkVeAUEUMcjXmqHQdn/ELJtUt4FKsuMK3a3ssOGXImPh+NG40EaWRLWK5M Iufx4q4oRBTDXyrw0fX1Xi0r1KJWSfRp47gNqf9yYbEJHGB2fwgxneqIyEYThdyI /f0XsDhHQcg3/zKHKHhIDnO+1CRd1lNon9BCw8sVRRiw5FCof3LyEyTuhTlQNkVc LEx+8yW0rfXXdUbchOsLONhfz+WzNIiqk6aCJ4jFGcv81Q6mVEB2YJX4lfAk2ved ltYWhpIokjoMXRNROlup2XxNTNOu4wcCVuL/4myIEo+e/gtXw67l0cDEp3cHxfJs JWIf7FV6tiIbBAlEOzheOTcYEJ5nQGzWpFAmzWnf1EBqq3gOFIVeCA+u6mbMF9LY HiyodaIJo9WSqaU1kOFCSsto3t5Kt4ZbZ5Wny5X351NdqrKSo7g377N/2zlPIcao F/Xn4AHpEkMydUW+wIglnRZ7vBGBxzsph39tM2AghA/81TYvC7HQkCFbe2aGWtC3 oE6csSKGnmQYRjD9tItgZMUq0hxZeo8tb5gRsaTyGks2s9Kb8a85UtVkhPj2waKh EcNrUiexLN1jRPmaQMdLRnkZRsfdkLnoxRtya0Ludfdd3zBLUdE= =V9Kx -----END PGP SIGNATURE----- --7oh609Nxx9xQT224zql1DI5BGI0ZA8cec--