hive-issues mailing list archives

From "Aihua Xu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HIVE-10754) new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
Date Thu, 18 Jun 2015 21:34:00 GMT

     [ https://issues.apache.org/jira/browse/HIVE-10754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aihua Xu updated HIVE-10754:
----------------------------
    Description: 
Some older versions of new Job() do not seem to be implemented properly, which causes the
following issue:
{noformat}
Create table tbl1 (key string, value string) stored as rcfile;
Create table tbl2 (key string, value string);
insert into tbl1 values( '1', '111');
insert into tbl2 values('1', '2');
{noformat}

Pig script:
{noformat}
src_tbl1 = FILTER tbl1 BY (key == '1');
prj_tbl1 = FOREACH src_tbl1 GENERATE
           key as tbl1_key,
           value as tbl1_value,
           '333' as tbl1_v1;
           
src_tbl2 = FILTER tbl2 BY (key == '1');
prj_tbl2 = FOREACH src_tbl2 GENERATE
           key as tbl2_key,
           value as tbl2_value;
           
dump prj_tbl1;
dump prj_tbl2;
result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
prj_result = FOREACH result 
      GENERATE  prj_tbl1::tbl1_key AS key1,
                prj_tbl1::tbl1_value AS value1,
                prj_tbl1::tbl1_v1 AS v1,
                prj_tbl2::tbl2_key AS key2,
                prj_tbl2::tbl2_value AS value2;
               
dump prj_result;
{noformat}

The expected result is (1,111,333,1,2), but the actual result is (1,2,333,1,2).
To fix this, replace all uses of the deprecated new Job() with Job.getInstance().
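
The issue text does not say why the join picks up tbl2's value in the tbl1 column. One possible reading, hinted at by the previous description's wording about cloning the job instance in HCatLoader, is that the two loaders end up sharing one mutable job configuration, so the second loader's settings overwrite the first's. The following is a minimal Java sketch of that failure mode and of a copy-per-loader remedy. It uses java.util.Properties as a stand-in for a Hadoop configuration. The class JobConfIsolationSketch, the key sketch.input.table, and both helper methods are hypothetical and are not part of Hadoop, Hive, or the attached patch.

```java
import java.util.Properties;

public class JobConfIsolationSketch {

    // Hypothetical key under which a loader records the table it reads.
    static final String TABLE_KEY = "sketch.input.table";

    // Shared-state variant: the loader writes directly into the
    // configuration object it was handed and returns that same object.
    static Properties configureShared(Properties shared, String table) {
        shared.setProperty(TABLE_KEY, table);
        return shared;
    }

    // Copy variant: the loader first copies the base configuration,
    // then writes its own settings into the copy only.
    static Properties configureIsolated(Properties base, String table) {
        Properties copy = new Properties();
        copy.putAll(base);
        copy.setProperty(TABLE_KEY, table);
        return copy;
    }

    public static void main(String[] args) {
        // Two loaders share one object: the second write wins for both.
        Properties sharedBase = new Properties();
        Properties sharedA = configureShared(sharedBase, "tbl1");
        Properties sharedB = configureShared(sharedBase, "tbl2");
        System.out.println("shared:   loader1 reads " + sharedA.getProperty(TABLE_KEY)
                + ", loader2 reads " + sharedB.getProperty(TABLE_KEY));

        // Each loader works on its own copy: the settings stay separate.
        Properties isolatedBase = new Properties();
        Properties isolatedA = configureIsolated(isolatedBase, "tbl1");
        Properties isolatedB = configureIsolated(isolatedBase, "tbl2");
        System.out.println("isolated: loader1 reads " + isolatedA.getProperty(TABLE_KEY)
                + ", loader2 reads " + isolatedB.getProperty(TABLE_KEY));
    }
}
```

In the shared variant both loaders report tbl2, which resembles the symptom above where the tbl1 value column shows tbl2's value. In Hadoop itself, the call-site change this issue describes is from new Job(conf) to Job.getInstance(conf). Whether either call copies the configuration depends on the Hadoop version and is not verified here.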


  was:
{noformat}
Create table tbl1 (key string, value string) stored as rcfile;
Create table tbl2 (key string, value string);
insert into tbl1 values( '1', '111');
insert into tbl2 values('1', '2');
{noformat}

Pig script:
{noformat}
src_tbl1 = FILTER tbl1 BY (key == '1');
prj_tbl1 = FOREACH src_tbl1 GENERATE
           key as tbl1_key,
           value as tbl1_value,
           '333' as tbl1_v1;
           
src_tbl2 = FILTER tbl2 BY (key == '1');
prj_tbl2 = FOREACH src_tbl2 GENERATE
           key as tbl2_key,
           value as tbl2_value;
           
dump prj_tbl1;
dump prj_tbl2;
result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
prj_result = FOREACH result 
      GENERATE  prj_tbl1::tbl1_key AS key1,
                prj_tbl1::tbl1_value AS value1,
                prj_tbl1::tbl1_v1 AS v1,
                prj_tbl2::tbl2_key AS key2,
                prj_tbl2::tbl2_value AS value2;
               
dump prj_result;
{noformat}

The expected result is (1,111,333,1,2) while the result is (1,2,333,1,2).  We need to clone
the job instance in HCatLoader.



> new Job() is deprecated. Replaced all with Job.getInstance() for Hcatalog
> -------------------------------------------------------------------------
>
>                 Key: HIVE-10754
>                 URL: https://issues.apache.org/jira/browse/HIVE-10754
>             Project: Hive
>          Issue Type: Sub-task
>          Components: HCatalog
>    Affects Versions: 1.2.0
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>         Attachments: HIVE-10754.patch
>
>
> Some older versions of new Job() do not seem to be implemented properly, which causes the
> following issue:
> {noformat}
> Create table tbl1 (key string, value string) stored as rcfile;
> Create table tbl2 (key string, value string);
> insert into tbl1 values( '1', '111');
> insert into tbl2 values('1', '2');
> {noformat}
> Pig script:
> {noformat}
> src_tbl1 = FILTER tbl1 BY (key == '1');
> prj_tbl1 = FOREACH src_tbl1 GENERATE
>            key as tbl1_key,
>            value as tbl1_value,
>            '333' as tbl1_v1;
>            
> src_tbl2 = FILTER tbl2 BY (key == '1');
> prj_tbl2 = FOREACH src_tbl2 GENERATE
>            key as tbl2_key,
>            value as tbl2_value;
>            
> dump prj_tbl1;
> dump prj_tbl2;
> result = JOIN prj_tbl1 BY (tbl1_key), prj_tbl2 BY (tbl2_key);
> prj_result = FOREACH result 
>       GENERATE  prj_tbl1::tbl1_key AS key1,
>                 prj_tbl1::tbl1_value AS value1,
>                 prj_tbl1::tbl1_v1 AS v1,
>                 prj_tbl2::tbl2_key AS key2,
>                 prj_tbl2::tbl2_value AS value2;
>                
> dump prj_result;
> {noformat}
> The expected result is (1,111,333,1,2), but the actual result is (1,2,333,1,2).
> To fix this, replace all uses of the deprecated new Job() with Job.getInstance().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
