Date: Fri, 20 Jan 2017 13:09:27 +0000 (UTC)
From: "Nandor Kollar (JIRA)"
To: pig-dev@hadoop.apache.org
Reply-To: dev@pig.apache.org
Subject: [jira] [Updated] (PIG-4891) Implement FR join by broadcasting small rdd not making more copys of data

     [ https://issues.apache.org/jira/browse/PIG-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-4891:
-------------------------------
    Attachment:     (was: PIG-4891_1.patch)

> Implement FR join by broadcasting small rdd not making more copys of data
> -------------------------------------------------------------------------
>
>                 Key: PIG-4891
>                 URL: https://issues.apache.org/jira/browse/PIG-4891
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Nandor Kollar
>             Fix For: spark-branch
>
>
> In the current implementation of FRJoin (PIG-4771), we simply set the replication of the data to 10 to make data access more efficient, because the existing FRJoin algorithms can be reused this way. We need to figure out how to implement FRJoin in the current code base by broadcasting the small RDD, if we find that broadcasting improves performance significantly.
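
For readers unfamiliar with the idea in the description, here is a minimal sketch of a broadcast-style (map-side) join in plain Python. It is not Pig or Spark code and not the patch attached to this issue: the function names, the tuple layout, and the use of lists as stand-ins for RDD partitions are all assumptions made for illustration. The small relation is indexed once into a lookup table, and every partition of the large relation joins against that shared table instead of reading extra replicated copies of the small data.

```python
from collections import defaultdict


def build_lookup(small_rows, key_index=0):
    """Index the small relation by join key.

    The resulting dict stands in for the broadcast value (hypothetical helper).
    """
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key_index]].append(row)
    return dict(lookup)


def join_partition(partition, lookup, key_index=0):
    """Inner-join one partition of the large relation against the lookup table."""
    for row in partition:
        for match in lookup.get(row[key_index], []):
            yield row + match


def broadcast_join(large_partitions, small_rows):
    """Build the lookup once and reuse it for every partition."""
    lookup = build_lookup(small_rows)
    return [list(join_partition(p, lookup)) for p in large_partitions]


# Two partitions of the "large" relation and one "small" relation (made-up data).
large_partitions = [[(1, "a"), (2, "b")], [(1, "c"), (3, "d")]]
small_rows = [(1, "x"), (2, "y")]
result = broadcast_join(large_partitions, small_rows)
```

In this sketch the row with key 3 has no match in the small relation and is dropped, as in an inner join. Whether a real broadcast outperforms the replication-based approach is exactly the open question the issue raises.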

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)