pig-dev mailing list archives

From "Daniel Dai (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-2348) Bloom should be able to take a relation or a file
Date Thu, 03 Nov 2011 01:51:32 GMT

     [ https://issues.apache.org/jira/browse/PIG-2348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Dai updated PIG-2348:

    Attachment: PIG-2348-0.patch

PIG-2348-0.patch is a partial patch. We still need to address backward compatibility, so that
Bloom works in both scalar mode and distributed cache file mode.

> Bloom should be able to take a relation or a file
> -------------------------------------------------
>                 Key: PIG-2348
>                 URL: https://issues.apache.org/jira/browse/PIG-2348
>             Project: Pig
>          Issue Type: Improvement
>          Components: internal-udfs
>    Affects Versions: 0.11
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>            Priority: Minor
>             Fix For: 0.11
>         Attachments: PIG-2348-0.patch
>
> Currently Bloom requires the user to have stored the result of a previous query using
> BuildBloom to an HDFS file before it can be used.  This means the user must submit an "exec"
> between this store and the filter that uses Bloom if they are to be used in the same script.
> If Bloom could take a relation as its first input (à la the relation cast to scalar) then
> users would not need to put an exec in their script or manage a storage location on HDFS.
> Sometimes storing the results in the file makes sense, so we don't want to remove the
> current behavior, just add another option.
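
For reference, the current two-step workflow described in the issue looks roughly like the sketch below. This is a minimal illustration, not taken from the patch: the relation names, input paths, and the 'mybloom' location are hypothetical, and it assumes the BuildBloom constructor that takes a hash type, an expected element count, and a desired false-positive rate. The relation-as-input form proposed in this issue is not shown, since its syntax is not settled here.

```
-- Step 1: build the Bloom filter from the small relation and store it on HDFS.
-- Constructor arguments are illustrative values only.
define bb BuildBloom('jenkins', '100', '0.1');
small = load 'small_input' as (x, y);
grpd = group small all;
fltr = foreach grpd generate bb(small.x);
store fltr into 'mybloom';

-- Barrier: force the store above to run before the filter file is read.
exec;

-- Step 2: use the stored filter to prune the large relation before the join.
define bloom Bloom('mybloom');
small = load 'small_input' as (x, y);
large = load 'large_input' as (z);
pruned = filter large by bloom(z);
joined = join pruned by z, small by x;
store joined into 'results';
```

The exec statement and the 'mybloom' path are the two pieces the issue would like to make optional.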
