From: chandnisingh
To: dev@apex.incubator.apache.org
Reply-To: dev@apex.incubator.apache.org
Subject: [GitHub] incubator-apex-malhar pull request: MLHR-1812 Add anti-join operat...
Content-Type: text/plain
Message-Id: <20151113192138.8D43FE0441@git1-us-west.apache.org>
Date: Fri, 13 Nov 2015 19:21:38 +0000 (UTC)

Github user chandnisingh commented on a diff in the pull request:

https://github.com/apache/incubator-apex-malhar/pull/51#discussion_r44822442

--- Diff: library/src/main/java/com/datatorrent/lib/streamquery/AntiJoinOperator.java ---
@@ -0,0 +1,205 @@
+/**
+ * Copyright (C) 2015 DataTorrent, Inc.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package com.datatorrent.lib.streamquery;
+
+import com.datatorrent.api.Context.OperatorContext;
+import com.datatorrent.api.DefaultInputPort;
+import com.datatorrent.api.DefaultOutputPort;
+import com.datatorrent.api.Operator;
+import com.datatorrent.api.annotation.OperatorAnnotation;
+import com.datatorrent.lib.streamquery.condition.Condition;
+import com.datatorrent.lib.streamquery.index.Index;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.classification.InterfaceStability.Evolving;
+
+/**
+ * An implementation of Operator that reads table row data from two table data input ports.
+ *

+ * Operator anti-joins row on given condition and selected names, emits
+ * anti-joined result at output port.
+ *
+ * StateFull : Yes, Operator aggregates input over application window.
+ * Partitions : No, will yield wrong result(s).
+ *
+ * Ports :
+ * inport1 : Input port for table 1, expects HashMap<String, Object>
+ * inport2 : Input port for table 2, expects HashMap<String, Object>
+ * outport : Output anti-joined row port, emits HashMap<String, ArrayList<Object>>
+ *
+ * Properties :
+ * joinCondition : Join condition for table rows.
+ * table1Columns : Columns to be selected from table1.
+ * table2Columns : Columns to be selected from table2.
+ *
+ *
+ *
+ * @displayName Anti join
+ * @category Stream Manipulators
+ * @tags sql, anti join operator
+ * @since 0.3.3
+ */
+@OperatorAnnotation(partitionable = false)
+@Evolving
+public class AntiJoinOperator implements Operator
+{
+
+  /**
+   * Join Condition;
+   */
+  protected Condition joinCondition;
+
+  /**
+   * Table1 select columns.
+   * Note: only left table (Table1) will be output in an Anti-join
+   */
+  private ArrayList table1Columns = new ArrayList();
+
+  /**
+   * Collect data rows from input port 1.
+   */
+  protected ArrayList> table1;
--- End diff --

This field is exposed to extensions. If we later switch to another implementation (LinkedList or ConcurrentList), changing the declared type will break backward compatibility. If the field is declared as 'List' instead, the implementation can change without breaking compatibility. If you think the list implementation could change in the future, I recommend declaring the type as List.
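
To illustrate the point about declaring the field as List, here is a minimal sketch. The class names (ListFieldSketch, RowBuffer, LinkedRowBuffer) and the field name rows are hypothetical and not part of the pull request. The element type Map<String, Object> is an assumption based on the port types described in the Javadoc. Because the protected field is typed to the interface, the backing implementation can be swapped without changing the declared type that extensions compile against.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

public class ListFieldSketch
{
  /** Hypothetical base class: the protected field is declared as List, not ArrayList. */
  static class RowBuffer
  {
    // Assumed row type; extensions see only the List interface.
    protected List<Map<String, Object>> rows = new ArrayList<>();

    void add(Map<String, Object> row)
    {
      rows.add(row);
    }

    int size()
    {
      return rows.size();
    }
  }

  /** Hypothetical extension that swaps the implementation. Legal only because the field is a List. */
  static class LinkedRowBuffer extends RowBuffer
  {
    LinkedRowBuffer()
    {
      rows = new LinkedList<>();
    }
  }

  public static void main(String[] args)
  {
    RowBuffer buffer = new LinkedRowBuffer();
    Map<String, Object> row = new HashMap<>();
    row.put("id", 1);
    buffer.add(row);
    // Code written against RowBuffer is unaffected by the implementation change.
    System.out.println(buffer.size() + " " + buffer.rows.getClass().getSimpleName());
  }
}
```

Had rows been declared as ArrayList, the assignment in LinkedRowBuffer would not compile, and changing the declared type later would break any subclass that relied on ArrayList-specific methods.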