Date: Wed, 19 Apr 2017 22:46:04 +0000 (UTC)
From: "Kenneth Knowles (JIRA)"
To: commits@beam.apache.org
Reply-To: dev@beam.apache.org
Subject: [jira] [Updated] (BEAM-1283) DoFn finishBundle should be required to specify the window for output

     [ https://issues.apache.org/jira/browse/BEAM-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kenneth Knowles updated BEAM-1283:
----------------------------------
    Summary: DoFn finishBundle should be required to specify the window for output  (was: DoFn finishBundle should be required to specify the window)

> DoFn finishBundle should be required to specify the window for output
> ---------------------------------------------------------------------
>
>                 Key: BEAM-1283
>                 URL: https://issues.apache.org/jira/browse/BEAM-1283
>             Project: Beam
>          Issue Type: Bug
>          Components: beam-model, sdk-java-core, sdk-py
>            Reporter: Kenneth Knowles
>              Labels: backward-incompatible
>             Fix For: First stable release
>
>
> The spec is here in Javadoc: https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/DoFn.java#L128
> "If invoked from {{@StartBundle}} or {{@FinishBundle}}, this will attempt to use the {{WindowFn}} of the input {{PCollection}} to determine what windows the element should be in, throwing an exception if the {{WindowFn}} attempts to access any information about the input element. The output element will have a timestamp of negative infinity."
> This is a collection of caveats that make this method not always technically wrong, but quite a mess. Ideas that reasonable folks have suggested lately:
>  - The {{WindowFn}} cannot actually be applied because {{WindowFn}} is allowed to see the element type. The spec just avoids this by limiting which {{WindowFn}} can be used.
>  - There is no natural output timestamp, so it should always be provided. The spec avoids this by specifying an arbitrary and fairly useless timestamp.
>  - If it is a merging {{WindowFn}} like sessions that has already been merged, then you'll just have a bogus proto window, whether or not the timestamp is explicit.
> The use cases for these methods are best addressed by state plus a window expiry callback, so we should revisit this spec and probably just wipe it.
> There are some rare cases where you might need to output from {{FinishBundle}} in a way that is not _actually_ sensitive to bundling (perhaps modulo some downstream notion of equivalence), in which case you had better know what window you are outputting to. Often it should be the global window.
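
To make the proposal concrete, here is a minimal sketch of what "specify the window for output" could look like when flushing at the end of a bundle. It is a toy model in plain Python, not the Beam API: `IntervalWindow`, `WindowedValue`, and `BatchingFn` are hypothetical stand-ins, and using the window's maximum timestamp as the output timestamp is an assumption made only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List


# Toy stand-in (not a Beam class): a window is a half-open interval [start, end).
@dataclass(frozen=True)
class IntervalWindow:
    start: int
    end: int

    def max_timestamp(self) -> int:
        # Last timestamp that still belongs to this window.
        return self.end - 1


# Toy stand-in (not a Beam class): an output carrying its timestamp and window.
@dataclass(frozen=True)
class WindowedValue:
    value: Any
    timestamp: int
    window: IntervalWindow


class BatchingFn:
    """Hypothetical DoFn-like class: buffers per window, flushes at bundle end."""

    def __init__(self) -> None:
        self._buffers: Dict[IntervalWindow, List[Any]] = defaultdict(list)

    def start_bundle(self) -> None:
        self._buffers.clear()

    def process(self, value: Any, window: IntervalWindow) -> None:
        # Remember the window each element arrived in; nothing is emitted yet.
        self._buffers[window].append(value)

    def finish_bundle(self) -> Iterator[WindowedValue]:
        # No input element is in scope here, so nothing can be inferred:
        # every output names its window and timestamp explicitly.
        for window, values in sorted(
            self._buffers.items(), key=lambda item: item[0].start
        ):
            yield WindowedValue(tuple(values), window.max_timestamp(), window)
        self._buffers.clear()
```

Because each buffered batch is keyed by the window it was observed in, the flush never needs a `WindowFn` and never falls back to a timestamp of negative infinity. A function whose output really is insensitive to bundling could use a single global window as the key instead.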