From: AJAY GUPTA
Date: Mon, 16 Jan 2017 16:20:38 +0530
Subject: Re: Schema Discovery Support in Apex Applications
To: dev@apex.apache.org

+1 for the idea. I just had one question. As I understand it, some form of
anonymous POJO will be used as the object that passes information from one
operator to another. Can you share how the user or operator developer would
access the tuple object if they want to do something with it?

Ajay

On Mon, Jan 16, 2017 at 2:53 PM, Chinmay Kolhatkar wrote:

> Hi All,
>
> Currently, if a user-generated DAG contains any POJOfied operators, the
> TUPLE_CLASS attribute needs to be set on each and every port that
> receives or sends a POJO.
>
> For example, if a DAG is File -> Parser -> Transform -> Dedup ->
> Formatter -> Kafka, then the user needs to set the TUPLE_CLASS attribute
> on both the input and output ports of the Transform and Dedup operators,
> and also on the Parser output and the Formatter input.
>
> The proposal here is to reduce the work required of the user to configure
> the DAG. Technically speaking, if an operator knows its input schema and
> processing properties, it can determine its output schema and convey it to
> downstream operators. This way the complete pipeline can be configured
> without the user setting TUPLE_CLASS or even creating POJOs and adding
> them to the classpath.
>
> On the same idea, I want to propose an approach where the pipeline can be
> configured without the user setting TUPLE_CLASS or even creating POJOs and
> adding them to the classpath.
> Here is the document, which explains the idea and a high-level design:
> https://docs.google.com/document/d/1ibLQ1KYCLTeufG7dLoHyN_tRQXEM3LR-7o_S0z_porQ/edit?usp=sharing
>
> I would like to get the community's opinion on the feasibility and
> applications of this proposal.
> Once we reach some consensus, we can discuss the design in detail.
>
> Thanks,
> Chinmay.
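
To make the quoted idea concrete (an operator that knows its input schema and
its own properties can work out its output schema and hand it downstream),
here is a minimal, self-contained sketch. It does not use the Apex API and is
not the design from the linked document. The names Stage, source, addField,
passThrough and propagate are all hypothetical, and a schema is assumed to be
just an ordered map from field name to Java type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class SchemaPropagationSketch {

    /** Hypothetical stand-in for an operator: a name plus a rule that maps input schema to output schema. */
    static final class Stage {
        final String name;
        final Function<Map<String, Class<?>>, Map<String, Class<?>>> deriveOutput;

        Stage(String name, Function<Map<String, Class<?>>, Map<String, Class<?>>> deriveOutput) {
            this.name = name;
            this.deriveOutput = deriveOutput;
        }
    }

    /** Parser-like stage: it has no upstream schema, so it declares the schema itself. */
    static Stage source(String name, Map<String, Class<?>> declared) {
        return new Stage(name, in -> new LinkedHashMap<>(declared));
    }

    /** Transform-like stage: copies the input schema and adds one derived field. */
    static Stage addField(String name, String field, Class<?> type) {
        return new Stage(name, in -> {
            Map<String, Class<?>> out = new LinkedHashMap<>(in);
            out.put(field, type);
            return out;
        });
    }

    /** Dedup-like stage: the output schema is the input schema, unchanged. */
    static Stage passThrough(String name) {
        return new Stage(name, in -> in);
    }

    /** Walks a linear pipeline and returns the output schema computed for each stage, in order. */
    static List<Map<String, Class<?>>> propagate(List<Stage> stages) {
        List<Map<String, Class<?>>> schemas = new ArrayList<>();
        Map<String, Class<?>> current = new LinkedHashMap<>();
        for (Stage stage : stages) {
            current = stage.deriveOutput.apply(current);
            schemas.add(current);
        }
        return schemas;
    }

    public static void main(String[] args) {
        // Illustrative field names only; they do not come from the proposal.
        Map<String, Class<?>> parsed = new LinkedHashMap<>();
        parsed.put("id", Long.class);
        parsed.put("amount", Double.class);

        List<Stage> pipeline = Arrays.asList(
                source("Parser", parsed),
                addField("Transform", "amountWithTax", Double.class),
                passThrough("Dedup"));

        List<Map<String, Class<?>>> schemas = propagate(pipeline);
        for (int i = 0; i < pipeline.size(); i++) {
            System.out.println(pipeline.get(i).name + " -> " + schemas.get(i).keySet());
        }
    }
}
```

In this sketch only the first stage declares a schema; every later stage's
schema is computed from its upstream neighbour, which is the part that would
replace setting TUPLE_CLASS by hand on each port. How tuples themselves would
be represented and accessed at runtime is left open here, as in the question
above.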