Date: Wed, 23 Mar 2016 04:15:25 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: dev@apex.incubator.apache.org
Subject: [jira] [Commented] (APEXMALHAR-2015) Projection Operator

    [ https://issues.apache.org/jira/browse/APEXMALHAR-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15207845#comment-15207845 ]

ASF GitHub Bot commented on APEXMALHAR-2015:
--------------------------------------------

Github user pradeepdalvi commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/217#discussion_r57107850

    --- Diff: library/src/main/java/com/datatorrent/lib/projection/ProjectionOperator.java ---
    @@ -0,0 +1,303 @@
    +/**
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements. See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership. The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License. You may obtain a copy of the License at
    + *
    + * http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing,
    + * software distributed under the License is distributed on an
    + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    + * KIND, either express or implied. See the License for the
    + * specific language governing permissions and limitations
    + * under the License.
    + */
    +
    +package com.datatorrent.lib.projection;
    +
    +import java.lang.reflect.Field;
    +
    +import java.util.ArrayList;
    +import java.util.Arrays;
    +import java.util.List;
    +
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import org.apache.commons.lang3.ClassUtils;
    +
    +import com.datatorrent.api.AutoMetric;
    +import com.datatorrent.api.Context;
    +import com.datatorrent.api.Context.PortContext;
    +import com.datatorrent.api.DefaultInputPort;
    +import com.datatorrent.api.DefaultOutputPort;
    +
    +import com.datatorrent.api.Operator;
    +import com.datatorrent.api.annotation.InputPortFieldAnnotation;
    +import com.datatorrent.api.annotation.OutputPortFieldAnnotation;
    +
    +import com.datatorrent.common.util.BaseOperator;
    +
    +import com.datatorrent.lib.util.PojoUtils;
    +
    +/**
    + * ProjectionOperator
    + * Projection Operator projects defined set of fields from given selectFields/dropFields
    + *
    + * Parameters
    + * - selectFields: comma separated list of fields to be selected from input tuples
    + * - dropFields: comma separated list of fields to be dropped from input tuples
    + * selectFields and dropFields are optional and either of them shall be specified
    + * When both are not specified, all fields shall be projected to downstream operator
    + *
    + * Input Port takes POJOs as an input
    + *
    + * Output Ports
    + * - projected port emits POJOs with projected fields from input POJOs
    + * - remainder port, if connected, emits POJOs with remainder fields from input POJOs
    + * - error port emits input POJOs as is upon error situations
    + *
    + * Examples
    + * For {a, b, c} type of input tuples
    + * - when selectFields = "" and dropFields = "", projected port shall emit {a, b, c}
    + * - when selectFields = "b", projected port shall emit {b} and remainder port shall emit {a, c}
    + * - when dropFields = "b", projected port shall emit {a, c} and remainder port shall emit {b}
    + *
    + */
    +
    +public class ProjectionOperator extends BaseOperator implements Operator.ActivationListener
    +{
    +  protected String selectFields;
    +  protected String dropFields;
    +  protected String condition;
    +
    +  static class TypeInfo
    +  {
    +    String name;
    +    Class type;
    +    PojoUtils.Setter setter;
    +    PojoUtils.Getter getter;
    +
    +    public TypeInfo(String name, Class type)
    +    {
    +      this.name = name;
    +      this.type = type;
    +    }
    +
    +    public String toString()
    +    {
    +      String s = new String("'name': " + name + " 'type': " + type);
    +      return s;
    +    }
    +  }
    +
    +  protected transient List projectedFields = new ArrayList<>();
    +  protected transient List remainderFields = new ArrayList<>();
    +
    +  @AutoMetric
    +  protected long projectedTuples;
    +
    +  @AutoMetric
    +  protected long remainderTuples;
    +
    +  @AutoMetric
    +  protected long errorTuples;
    +
    +  protected Class inClazz = null;
    +  protected Class projectedClazz = null;
    +  protected Class remainderClazz = null;
    +
    +  @InputPortFieldAnnotation(schemaRequired = true)
    +  public transient DefaultInputPort in = new DefaultInputPort()
    +  {
    +    public void setup(PortContext context)
    +    {
    +      inClazz = context.getValue(Context.PortContext.TUPLE_CLASS);
    +    }
    +
    +    @Override
    +    public void process(Object t)
    +    {
    +      handleProjection(t);
    +    }
    +  };
    +
    +  @OutputPortFieldAnnotation(schemaRequired = true)
    +  public final transient DefaultOutputPort projected = new DefaultOutputPort()
    +  {
    +    public void setup(PortContext context)
    +    {
    +      projectedClazz = context.getValue(Context.PortContext.TUPLE_CLASS);
    +    }
    +  };
    +
    +  @OutputPortFieldAnnotation(schemaRequired = true)
    +  public final transient DefaultOutputPort remainder = new DefaultOutputPort()
    +  {
    +    public void setup(PortContext context)
    +    {
    +      remainderClazz = context.getValue(Context.PortContext.TUPLE_CLASS);
    +    }
    +  };
    +
    +
    +  @OutputPortFieldAnnotation(schemaRequired = true)
    +  public final transient DefaultOutputPort error = new DefaultOutputPort()
    +  {
    +    public void setup(PortContext context)
    +    {
    +      inClazz = context.getValue(Context.PortContext.TUPLE_CLASS);
    +    }
    +  };
    +
    +  /**
    +   * addProjectedField: Add field details (name, type, getter and setter) for field with given name
    +   * in projectedFields list
    +   */
    +  protected void addProjectedField(String s)
    +  {
    +    try {
    +      Field f = inClazz.getDeclaredField(s);
    +      TypeInfo t = new TypeInfo(f.getName(), ClassUtils.primitiveToWrapper(f.getType()));
    +      t.getter = PojoUtils.createGetter(inClazz, t.name, t.type);
    +      logger.debug("Creating setter {} {} {}", projectedClazz, t.name, t.type);
    +      t.setter = PojoUtils.createSetter(projectedClazz, t.name, t.type);
    +      projectedFields.add(t);
    +    } catch (NoSuchFieldException e) {
    +      throw new RuntimeException("Field " + s + " not found in class " + inClazz, e);
    +    }
    +  }
    +
    +  /**
    +   * addRemainderField: Add field details (name, type, getter and setter) for field with given name
    +   * in remainderFields list
    +   */
    +  protected void addRemainderField(String s)
    +  {
    +    try {
    +      Field f = inClazz.getDeclaredField(s);
    +      TypeInfo t = new TypeInfo(f.getName(), ClassUtils.primitiveToWrapper(f.getType()));
    +      t.getter = PojoUtils.createGetter(inClazz, t.name, t.type);
    +      t.setter = PojoUtils.createSetter(remainderClazz, t.name, t.type);
    +      remainderFields.add(t);
    +    } catch (NoSuchFieldException e) {
    +      throw new RuntimeException("Field " + s + " not found in class " + inClazz, e);
    +    }
    +  }
    +
    --- End diff --

    projectedFields and remainderFields are separate, and an extending class might be interested in only one of them. Hence, populating these lists has been kept in separate methods.


> Projection Operator
> -------------------
>
>                 Key: APEXMALHAR-2015
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2015
>             Project: Apache Apex Malhar
>          Issue Type: New Feature
>            Reporter: Pradeep A Dalvi
>
> Projection Operator will allow Apex users to project (select/drop) certain fields from the incoming tuples. This operation might be done unconditionally or based on a certain condition.
> Use case:
> -------------
> Not all fields of tuples are of interest for the downstream operators. In such cases, one may want to project selected fields downstream. Also, one may want to drop a few fields instead of selecting many.
> In certain scenarios, one may want to project certain fields based on a given condition or expression.
> Functionality:
> -----------------
> 1. The Projection operator shall receive a POJO as the input tuple and emit 2 POJOs on separate output ports, i.e. selected and dropped. The selected output port shall emit a POJO with the selected fields, and the dropped output port shall emit a POJO with the dropped fields.
> 2. The operator needs select or drop fields as input params. These shall be specified as a comma separated list of fields.
> 3. The operator shall emit POJOs only on connected output ports. In other words, if the dropped output port is not connected, it shall not even try to generate POJOs with dropped fields.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
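
For readers following the thread, the select/drop semantics described in the operator javadoc and in the Functionality list above can be sketched on plain field names. This is only an illustration under stated assumptions, not code from the pull request: the class ProjectionSketch and its methods parse, projected and remainder are hypothetical, it works on names rather than POJOs, and it assumes selectFields takes precedence when both parameters are set (the javadoc does not say what happens in that case).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the pull request: it works on field names only.
final class ProjectionSketch
{
  // Splits a comma separated list; null or blank input gives an empty list.
  static List<String> parse(String csv)
  {
    List<String> names = new ArrayList<>();
    if (csv != null) {
      for (String part : csv.split(",")) {
        String name = part.trim();
        if (!name.isEmpty()) {
          names.add(name);
        }
      }
    }
    return names;
  }

  // Field names for the projected port, in input order.
  // Assumption: selectFields wins when both parameters are set.
  static List<String> projected(List<String> inputFields, String selectFields, String dropFields)
  {
    List<String> select = parse(selectFields);
    List<String> drop = parse(dropFields);
    List<String> out = new ArrayList<>();
    for (String f : inputFields) {
      boolean keep = select.isEmpty() ? !drop.contains(f) : select.contains(f);
      if (keep) {
        out.add(f);
      }
    }
    return out;
  }

  // Field names for the remainder port: every input field that is not projected.
  static List<String> remainder(List<String> inputFields, String selectFields, String dropFields)
  {
    List<String> out = new ArrayList<>(inputFields);
    out.removeAll(projected(inputFields, selectFields, dropFields));
    return out;
  }

  public static void main(String[] args)
  {
    List<String> abc = Arrays.asList("a", "b", "c");
    // The three cases from the javadoc "Examples" section.
    System.out.println(projected(abc, "", "") + " " + remainder(abc, "", ""));
    System.out.println(projected(abc, "b", "") + " " + remainder(abc, "b", ""));
    System.out.println(projected(abc, "", "b") + " " + remainder(abc, "", "b"));
  }
}
```

Computing the two lists through separate methods mirrors the point made in the review comment: a caller that only needs the projected side never has to build the remainder list.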