Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 615AA200C63 for ; Thu, 11 May 2017 09:44:33 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 5E66D160BC7; Thu, 11 May 2017 07:44:33 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 792D8160BB2 for ; Thu, 11 May 2017 09:44:32 +0200 (CEST) Received: (qmail 31780 invoked by uid 500); 11 May 2017 07:44:31 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 31762 invoked by uid 99); 11 May 2017 07:44:31 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd1-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 11 May 2017 07:44:31 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd1-us-west.apache.org (ASF Mail Server at spamd1-us-west.apache.org) with ESMTP id 7D65ECDB21 for ; Thu, 11 May 2017 07:44:30 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd1-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -0.897 X-Spam-Level: X-Spam-Status: No, score=-0.897 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-2.796, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd1-us-west.apache.org (amavisd-new); dkim=pass (1024-bit key) header.d=openbet.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd1-us-west.apache.org [10.40.0.7]) (amavisd-new, port 10024) with ESMTP id zW__5jjvlbmd for ; Thu, 11 May 2017 07:44:26 +0000 (UTC) Received: from mail-wr0-f170.google.com (mail-wr0-f170.google.com [209.85.128.170]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id C92C35FAE1 for ; Thu, 11 May 2017 07:44:25 +0000 (UTC) Received: by mail-wr0-f170.google.com with SMTP id w50so13432493wrc.0 for ; Thu, 11 May 2017 00:44:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=openbet.com; s=google; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to; bh=M+N4gQvk2MRJTGSSfkkBpNWXRJ7Tr0e5u4+L3mWTSoI=; b=L/NGWEwB+Fb3mt+aCcgyWoTWzWcDSI7kT+HfJPH6yUy2BOokTSmSZChle2oKyVGFpm 20IbldhAZmDnNTH45P0fy5cPryoI/mIs7yDjmtqo1fjBjWvIE+gHBfimB7hNNZOXfip2 /Qd9cVm9vKNyKpV4fa7FWv6DSOpiJGRTOtnPg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to; bh=M+N4gQvk2MRJTGSSfkkBpNWXRJ7Tr0e5u4+L3mWTSoI=; b=kTvym6AftzJ9gxExQ39kD3aOWWo37B56S3I2CN53N81uPSQCGTX7rth43IyOrZYIaN VF2PH339Lvo6J9Cm+Z+74wxbKWJpdLVUl7Rb6pB5QvXlC+A0XybNH+XZnDHtOCMh1IwH euJ/mbLnVBxPCplb35OZVtp+vc6+QwO6eeLUw+rSEaW6gs7tYgYFXcXgeT14ZSnJ0xmQ 654ysvbEhIyreqQD9/uuSlEUlbJlBIGXB4dBO6ukwgEPhXrpJVO0B1c085Q+OOE5wytq 6PUjsIN7iEJW5xO70xfRURoOlgeJRHZjxh2d9FBLwUatkPZ56LLvIJwZH4YTjjh/FzpC CDww== X-Gm-Message-State: AODbwcC1iROpmfVhMDkysqNmVaErDpeMnWnNhR4TIVrGl02v3yJ5ZDyu +Sqpp6D9tbYBULNAHrSezUfLSuDszRTZq858SewJQkNFSrVycfB1/hYe5vAVllL+5E5Fp2RGHs7 GHPHB2OkcMbzdCY4lRe1BEhK3ss3R8J7slO1My9bbitIbhzuXmVBc8mjn3ASToKBlL1UL X-Received: by 10.223.155.2 with SMTP id b2mr6609310wrc.87.1494488665129; Thu, 11 May 2017 00:44:25 -0700 (PDT) Received: from [172.31.98.189] (bba451212.alshamil.net.ae. [83.110.88.108]) by smtp.gmail.com with ESMTPSA id o20sm885589wro.61.2017.05.11.00.44.23 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 11 May 2017 00:44:24 -0700 (PDT) Subject: Re: [VOTE] KIP-155 Add range scan for windowed state stores To: dev@kafka.apache.org References: <481d3b87-bcc3-472c-ac3a-fa44603735d7@openbet.com> From: Michal Borowiecki Message-ID: <058bad63-8d10-e1ce-541d-3b98be35c53f@openbet.com> Date: Thu, 11 May 2017 08:44:21 +0100 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <481d3b87-bcc3-472c-ac3a-fa44603735d7@openbet.com> Content-Type: multipart/alternative; boundary="------------E4A01C20B4AC898E21C45D88" archived-at: Thu, 11 May 2017 07:44:33 -0000 --------------E4A01C20B4AC898E21C45D88 Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Also, wrt > In the case of the window store, the "key" of the single-key iterator is > the actual timestamp of the underlying entry, not just range of the > window, > so if we were to wrap the result key a window we wouldn't be getting back > the equivalent of the single key iterator. I believe the timestamp in the entry *is* the window start time (the end time is implicitly known by adding the window size to the window start time) https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamWindowReduce.java#L109 https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamWindowAggregate.java#L111 Both use window.start() as the timestamp when storing into the WindowStore. Or am I confusing something horribly here? Hope not ;-) If the above is correct, then using KeyValueIterator, V> as the return type of the new fetch method would indeed not lose anything the single key iterator is offering. The window end time could simply be calculated as window start time + window size (window size would have to be passed from the store supplier to the store implementation, which I think it isn't now but that's an implementation detail). If you take objection to exposing the window end time because the single key iterator doesn't do that, then an alternative could also be to have the return type of the new fetch be something like KeyValueItarator, V>, since the key is composed of the actual key and the timestamp together. peakNextKey() would then allow you to glimpse both the actual key and the associated window start time. This feels like a better workaround then putting the KeyValue pair in the V of the WindowStoreIterator. All-in-all, I'd still prefer KeyValueIterator, V> as it more clearly names what's what. What do you think? Thanks, Michal On 11/05/17 07:51, Michal Borowiecki wrote: > Well, another concern, apart from potential confusion, is that you > won't be able to peek the actual next key, just the timestamp. So the > tradeoff is between having consistency in return types versus > consistency in having the ability to know the next key without moving > the iterator. To me the latter just feels more important. > > Thanks, > Michal > On 11 May 2017 12:46 a.m., Xavier Léauté wrote: > > Thank you for the feedback Michal. > > While I agree the return may be a little bit more confusing to reason > about, the reason for doing so was to keep the range query interfaces > consistent with their single-key counterparts. > > In the case of the window store, the "key" of the single-key > iterator is > the actual timestamp of the underlying entry, not just range of > the window, > so if we were to wrap the result key a window we wouldn't be > getting back > the equivalent of the single key iterator. > > In both cases peekNextKey is just returning the timestamp of the > next entry > in the window store that matches the query. > > In the case of the session store, we already return Windowed > for the > single-key method, so it made sense there to also return > Windowed for > the range method. > > Hope this make sense? Let me know if you still have concerns about > this. > > Thank you, > Xavier > > On Wed, May 10, 2017 at 12:25 PM Michal Borowiecki < > michal.borowiecki@openbet.com> wrote: > > > Apologies, I missed the discussion (or lack thereof) about the > return > > type of: > > > > WindowStoreIterator> fetch(K from, K to, long > timeFrom, > > long timeTo) > > > > > > WindowStoreIterator (as the KIP mentions) is a subclass of > > KeyValueIterator > > > > KeyValueIterator has the following method: > > > > /** * Peek at the next key without advancing the iterator * > @return the > > key of the next value that would be returned from the next call > to next > > */ K peekNextKey(); > > > > Given the type in this case will be Long, I assume what it would > return > > is the window timestamp of the next found record? > > > > > > In the case of WindowStoreIterator fetch(K key, long > timeFrom, long > > timeTo); > > all records found by fetch have the same key, so it's harmless > to return > > the timestamp of the next found window but here we have varying > keys and > > varying windows, so won't it be too confusing? > > > > KeyValueIterator, V> (as in the proposed > > ReadOnlySessionStore.fetch) just feels much more intuitive. > > > > Apologies again for jumping onto this only once the voting has > already > > begun. > > Thanks, > > Michał > > > > On 10/05/17 20:08, Sriram Subramanian wrote: > > > +1 > > > > > > On Wed, May 10, 2017 at 11:42 AM, Bill Bejeck > wrote: > > > > > >> +1 > > >> > > >> Thanks, > > >> Bill > > >> > > >> On Wed, May 10, 2017 at 2:38 PM, Guozhang Wang > > > wrote: > > >> > > >>> +1. Thank you! > > >>> > > >>> On Wed, May 10, 2017 at 11:30 AM, Xavier Léauté > > > >>> wrote: > > >>> > > >>>> Hi everyone, > > >>>> > > >>>> Since there aren't any objections to this addition, I would > like to > > >> start > > >>>> the voting on KIP-155 so we can hopefully get this into 0.11. > > >>>> > > >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP+ > > >>>> 155+-+Add+range+scan+for+windowed+state+stores > > >>>> > > >>>> Voting will stay active for at least 72 hours. > > >>>> > > >>>> Thank you, > > >>>> Xavier > > >>>> > > >>> > > >>> > > >>> -- > > >>> -- Guozhang > > >>> > > > > > --------------E4A01C20B4AC898E21C45D88--