pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (PIG-4554) Compress pig.script before encoding
Date Thu, 24 Sep 2015 22:24:04 GMT

    [ https://issues.apache.org/jira/browse/PIG-4554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14907133#comment-14907133 ]

Rohini Palaniswamy edited comment on PIG-4554 at 9/24/15 10:23 PM:
-------------------------------------------------------------------

[~sandyridgeracer],
   Can you also:
   - Rename the class variable script to serializedScript so that it is self-explanatory, like
truncatedScript.
   - Keep the truncated script in its original form, not encoded. You will have to
remove the lines below:

{code}
// XML parser cann't handle certain characters, including
        // the control character (&#1). Use Base64 encoding to
        // get around this problem
+        this.truncatedScript = new String(Base64.encodeBase64(script.getBytes()));
{code}

and change

script = (script.length() > maxScriptSize) ? script.substring(0, maxScriptSize)
                                                   : script;

to 

this.truncatedScript = (script.length() > maxScriptSize) ? script.substring(0, maxScriptSize)
                                                   : script;
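
For illustration, the requested behaviour (truncate, but do not Base64-encode) can be sketched as a small standalone helper. The class name ScriptTruncation and the method truncate are hypothetical and are not part of Pig; only the ternary expression is taken from the snippet above.

```java
// Hypothetical helper for illustration only; not Pig's actual class or API.
final class ScriptTruncation {
    private ScriptTruncation() {}

    /**
     * Returns the script unchanged when it fits within maxScriptSize characters,
     * otherwise its first maxScriptSize characters. The result is plain text;
     * no Base64 encoding is applied.
     */
    static String truncate(String script, int maxScriptSize) {
        return (script.length() > maxScriptSize)
                ? script.substring(0, maxScriptSize)
                : script;
    }
}
```

With this shape, the truncated value stays human-readable, while any encoding needed for storage would apply only to the separate serialized copy.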




was (Author: rohini):
[~sandyridgeracer],
   Can you also 
   - change the variable script to serializedScript to make it self-explanatory
   - The truncated script should be in its original form and not encoded. You will have to
remove the lines below

{code}
// XML parser cann't handle certain characters, including
        // the control character (&#1). Use Base64 encoding to
        // get around this problem
+        this.truncatedScript = new String(Base64.encodeBase64(script.getBytes()));
{code}

and change

script = (script.length() > maxScriptSize) ? script.substring(0, maxScriptSize)
                                                   : script;

to 

this.truncatedScript = (script.length() > maxScriptSize) ? script.substring(0, maxScriptSize)
                                                   : script;



> Compress pig.script before encoding
> -----------------------------------
>
>                 Key: PIG-4554
>                 URL: https://issues.apache.org/jira/browse/PIG-4554
>             Project: Pig
>          Issue Type: Improvement
>    Affects Versions: 0.14.0
>            Reporter: Rohini Palaniswamy
>            Assignee: Sandeep Samdaria
>              Labels: newbie
>             Fix For: 0.16.0
>
>         Attachments: PIG-4554-2.patch, PIG-4554.patch
>
>
>   Currently we truncate the Pig script (maxScriptSize = 10240), Base64-encode it, and
> store it in the config. We should remove the truncation and store the full script by
> compressing it and then Base64-encoding it. We already do that for udfcontext, etc. This
> saves space, because the script compresses really well, and also makes the full Pig script
> available while debugging.
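
A minimal sketch of the compress-then-encode idea described in the issue, assuming GZIP compression and the JDK's java.util.Base64. Both are stand-ins chosen for a self-contained example; the codec and Base64 library Pig actually uses may differ, and the class name ScriptCodec and its methods are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration only; not Pig's actual serialization code.
final class ScriptCodec {
    private ScriptCodec() {}

    /** Compresses the full script with GZIP, then Base64-encodes the compressed bytes. */
    static String compressAndEncode(String script) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(script.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Base64 output is plain ASCII, so it is safe to store in an XML-based config.
        return Base64.getEncoder().encodeToString(buffer.toByteArray());
    }

    /** Reverses compressAndEncode: Base64-decodes, then decompresses back to the script. */
    static String decodeAndDecompress(String encoded) {
        byte[] compressed = Base64.getDecoder().decode(encoded);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Compressing before encoding matters because Base64 output is larger than its input, so encoding first would give the compressor less redundancy to work with and the stored value would no longer be XML-safe text.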



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
