pig-dev mailing list archives

From "Nandor Kollar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-4891) Implement FR join by broadcasting small rdd not making more copys of data
Date Fri, 20 Jan 2017 13:09:27 GMT

     [ https://issues.apache.org/jira/browse/PIG-4891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nandor Kollar updated PIG-4891:
-------------------------------
    Attachment:     (was: PIG-4891_1.patch)

> Implement FR join by broadcasting small rdd not making more copys of data
> -------------------------------------------------------------------------
>
>                 Key: PIG-4891
>                 URL: https://issues.apache.org/jira/browse/PIG-4891
>             Project: Pig
>          Issue Type: Sub-task
>          Components: spark
>            Reporter: liyunzhang_intel
>            Assignee: Nandor Kollar
>             Fix For: spark-branch
>
>
> In the current implementation of FRJoin (PIG-4771), we simply set the replication factor
> of the data to 10 to make data access more efficient, because the existing FRJoin algorithms
> can be reused this way. If broadcasting turns out to improve performance significantly, we
> need to figure out how to implement FRJoin in the current code base by broadcasting the
> small RDD.
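
To illustrate the idea behind a broadcast-based FR join, here is a minimal plain-Java sketch with no Spark or Pig dependency. It is not the PIG-4891 patch; the class and method names (BroadcastJoinSketch, buildLookup, joinPartition) and the two-column row layout are assumptions made for illustration only. The small relation is turned into an in-memory lookup table once, and each partition of the large relation probes that table locally, so the large side is never shuffled. In Spark, the lookup table would typically be shipped to executors as a broadcast variable (for example via JavaSparkContext.broadcast) rather than passed as a method argument.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: rows are String[]{joinKey, value}.
class BroadcastJoinSketch {

    // Build the lookup table once from the small relation: join key -> values.
    static Map<String, List<String>> buildLookup(List<String[]> small) {
        Map<String, List<String>> lookup = new HashMap<>();
        for (String[] row : small) {
            lookup.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return lookup;
    }

    // Probe the shared lookup table with one partition of the large relation.
    // This is an inner join: rows without a match in the small relation are dropped.
    static List<String> joinPartition(List<String[]> partition,
                                      Map<String, List<String>> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : partition) {
            List<String> matches = lookup.get(row[0]);
            if (matches == null) {
                continue;
            }
            for (String match : matches) {
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The small relation is read once and "broadcast" as a map.
        Map<String, List<String>> lookup = buildLookup(Arrays.asList(
                new String[] {"1", "a"},
                new String[] {"2", "c"}));

        // Each partition of the large relation is joined independently.
        List<String[]> partition = Arrays.asList(
                new String[] {"1", "x"},
                new String[] {"3", "y"},
                new String[] {"2", "z"});

        for (String joined : joinPartition(partition, lookup)) {
            System.out.println(joined);
        }
    }
}
```

The sketch only works when the small relation fits in memory on every worker, which is the same precondition a replicated join already has. What changes compared with raising the replication factor is that the small side is shipped to each worker once instead of being stored as extra file copies.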



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
