Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 4AAAF200B78 for ; Fri, 2 Sep 2016 12:21:23 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 492D4160AAB; Fri, 2 Sep 2016 10:21:23 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 93CB3160AAC for ; Fri, 2 Sep 2016 12:21:22 +0200 (CEST) Received: (qmail 83036 invoked by uid 500); 2 Sep 2016 10:21:21 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 82631 invoked by uid 99); 2 Sep 2016 10:21:21 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 02 Sep 2016 10:21:21 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 469052C1B7F for ; Fri, 2 Sep 2016 10:21:21 +0000 (UTC) Date: Fri, 2 Sep 2016 10:21:21 +0000 (UTC) From: "Matthias J. Sax (JIRA)" To: dev@kafka.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (KAFKA-4113) Allow KTable bootstrap MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 archived-at: Fri, 02 Sep 2016 10:21:23 -0000 [ https://issues.apache.org/jira/browse/KAFKA-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15458144#comment-15458144 ] Matthias J. Sax commented on KAFKA-4113: ---------------------------------------- Thanks for pointing out! I read it once. Good post! :) > Allow KTable bootstrap > ---------------------- > > Key: KAFKA-4113 > URL: https://issues.apache.org/jira/browse/KAFKA-4113 > Project: Kafka > Issue Type: Sub-task > Components: streams > Reporter: Matthias J. Sax > Assignee: Guozhang Wang > > On the mailing list, there are multiple request about the possibility to "fully populate" a KTable before actual stream processing start. > Even if it is somewhat difficult to define, when the initial populating phase should end, there are multiple possibilities: > The main idea is, that there is a rarely updated topic that contains the data. Only after this topic got read completely and the KTable is ready, the application should start processing. This would indicate, that on startup, the current partition sizes must be fetched and stored, and after KTable got populated up to those offsets, stream processing can start. > Other discussed ideas are: > 1) an initial fixed time period for populating > (it might be hard for a user to estimate the correct value) > 2) an "idle" period, ie, if no update to a KTable for a certain time is > done, we consider it as populated > 3) a timestamp cut off point, ie, all records with an older timestamp > belong to the initial populating phase > The API change is not decided yet, and the API desing is part of this JIRA. > One suggestion (for option (4)) was: > {noformat} > KTable table = builder.table("topic", 1000); // populate the table without reading any other topics until see one record with timestamp 1000. > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)